(54) A memory usage scheme for performing wavelet processing

(57) A system comprising a memory and wavelet processing logic is described. The memory is sized to
include lines to store a band of an image and additional lines. The wavelet processing logic comprises a
wavelet transform and access logic. The wavelet transform generates coefficients when applied to data in
the memory. The access logic reads data from the memory into line buffers to supply data stored in the
memory to the wavelet transform and stores coefficients in the memory, such that after data stored at a
first pair of lines is read from memory into the buffers of the access logic, the access logic reuses the
first pair of lines to store coefficients generated by the wavelet transform that are associated with a
second pair of lines different from the first pair of lines.
Description

FIELD OF THE INVENTION

[0001] The present invention relates to the field of compression and decompression; more particularly, the present
invention relates to memory usage for performing wavelet processing.

BACKGROUND OF THE INVENTION 

[0002] The new JPEG 2000 decoding standard (ITU-T Rec. T.800/ISO/IEC 15444-1:2000 JPEG 2000 Image Coding
System) provides a new coding scheme and codestream definition for images. Although the JPEG 2000 standard is
a decoding standard, JPEG 2000 specifies encoding and decoding by defining what a decoder must do. Under the
JPEG 2000 standard, each image is divided into one or more rectangular tiles. If there is more than one tile, the tiling
of the image creates tile-components that can be extracted or decoded independently of each other. Tile-components
comprise all of the samples of a given component in a tile. An image may have multiple components. Each of such
components comprises a two-dimensional array of samples. For example, a color image might have red, green and
blue components.

[0003] After tiling of an image, the tile-components may be decomposed into different decomposition levels using a
wavelet transformation. These decomposition levels contain a number of subbands populated with coefficients that
describe the horizontal and vertical spatial frequency characteristics of the original tile-components. The coefficients
provide frequency information about a local area, rather than across the entire image. That is, a small number of
coefficients completely describe a single sample. A decomposition level is related to the next decomposition level by
a spatial factor of two, such that each successive decomposition level of the subbands has approximately half the
horizontal resolution and half the vertical resolution of the previous decomposition level.

[0004] Although there are as many coefficients as there are samples, the information content tends to be concentrated
in just a few coefficients. Through quantization, the information content of a large number of coefficients is further
reduced. Additional processing by an entropy coder reduces the number of bits required to represent these quantized
coefficients, sometimes significantly compared to the original image.

[0005] The individual subbands of a tile-component are further divided into code-blocks. These code-blocks can be
grouped into partitions. These rectangular arrays of coefficients can be extracted independently. The individual bit-planes
of the coefficients in a code-block are entropy coded with three coding passes. Each of these coding passes
collects contextual information about the bit-plane compressed image data.

[0006] The compressed image data bit stream created from these coding passes is grouped in layers. Layers are
arbitrary groupings of successive coding passes from code-blocks. Although there is great flexibility in layering, the
premise is that each successive layer contributes to a higher quality image. Subband coefficients at each resolution
level are partitioned into rectangular areas called precincts.

[0007] Packets are a fundamental unit of the compressed codestream. A packet contains compressed image data
from one layer of a precinct of one resolution level of one tile-component. These packets are placed in a defined order
in the codestream.

[0008] The codestream relating to a tile, organized in packets, is arranged in one or more tile-parts. A tile-part
header, comprised of a series of markers and marker segments, or tags, contains information about the various
mechanisms and coding styles that are needed to locate, extract, decode, and reconstruct every tile-component. At the
beginning of the entire codestream is a main header, comprised of markers and marker segments, that offers similar
information as well as information about the original image.

[0009] The codestream is optionally wrapped in a file format that allows applications to interpret the meaning of, and
other information about, the image. The file format may contain data besides the codestream.

[0010] The decoding of a JPEG 2000 codestream is performed by reversing the order of the encoding steps. Figure
1 is a block diagram of the JPEG 2000 standard decoding scheme that operates on a compressed image data
codestream. Referring to Figure 1, a bitstream initially is received by data ordering block 101 that regroups layers and
subband coefficients. Arithmetic coder 102 uses contextual information collected during encoding about the bit-plane
compressed image data, and its internal state, to decode a compressed bit stream.

[0011] After arithmetic decoding, the coefficients undergo bit modeling in coefficient bit modeling block 103. Next,
the codestream is quantized by quantization block 104, which may be quantizing based on a region of interest (ROI)
as indicated by ROI block 105. After quantization, an inverse transform is applied to the remaining coefficients via
transform block 106, followed by DC and optional component transform block 107. This results in generation of a
reconstructed image.

[0012] The JPEG2000 standard leaves many choices to implementers. 
SUMMARY OF THE INVENTION

[0013] A system comprising a memory and wavelet processing logic is described. The memory is sized to include
lines to store a band of an image and additional lines. The wavelet processing logic comprises a wavelet transform
and access logic. The wavelet transform generates coefficients when applied to data in the memory. The access logic
reads data from the memory into line buffers to supply data stored in the memory to the wavelet transform and
stores coefficients in the memory, such that after data stored at a first pair of lines is read from memory into the buffers
of the access logic, the access logic reuses the first pair of lines to store coefficients generated by the wavelet transform
that are associated with a second pair of lines different from the first pair of lines.

BRIEF DESCRIPTION OF THE DRAWINGS 

[0014] The present invention will be understood more fully from the detailed description given below and from the
accompanying drawings of various embodiments of the invention, which, however, should not be taken to limit the
invention to the specific embodiments, but are for explanation and understanding only.

[0015] Figure 1 is a block diagram of the JPEG 2000 standard decoding scheme.
[0016] Figure 2 illustrates one embodiment of an organization for an image in memory.
[0017] Figures 3A-F illustrate a transform memory organization for various levels depicting conceptually how
coefficients may be stored for the forward (Figures 3A-C) and inverse (Figures 3D-F) transforms.
[0018] Figures 4A and B illustrate embodiments of a single memory where the input image data and the various
decomposition levels of the image data can be stored during forward and inverse transforms, respectively.
[0019] Figure 5 illustrates one embodiment of the process of handling the input data.
[0020] Figure 6A illustrates a system having a progression order conversion parser.
[0021] Figure 6B illustrates a progression converter converting from a resolution progressive codestream to a quality
progressive codestream.
[0022] Figure 7A shows multiple ways to convert a codestream from one progression order to another.
[0023] Figure 7B shows one embodiment of simplified conversion paths to convert a codestream from one
progression order to another.
[0024] Figure 8 illustrates one embodiment of a process for performing progression order conversion.
[0025] Figure 9 illustrates a decoder that selects portions of a codestream based on sideband information.
[0026] Figure 10 is a flow diagram of a process for using layers when decoding.
[0027] Figure 11 is a flow diagram of one embodiment of an editing process.
[0028] Figure 12 illustrates a bell-shaped curve of a range of values that are quantized to a particular value.
[0029] Figure 13 is a flow diagram of one embodiment of a process to reduce flicker.
[0030] Figure 14 illustrates one embodiment of an encoder (or portion thereof) that performs the quantization to
reduce flicker.
[0031] Figure 15A illustrates a process for performing rate control.
[0032] Figure 15B illustrates an exemplary number of layers that may be subjected to first and second passes.
[0033] Figure 16 illustrates one embodiment of the process for accessing the groupings of tile parts.
[0034] Figures 17 and 18 illustrate quantizers for one component for a three level 5,3 transform.
[0035] Figure 19 illustrates an example of HVS weighted quantization.
[0036] Figure 20 is a block diagram of one embodiment of a computer system.
[0037] Figure 21 illustrates an example progression with tile parts for a single server.
[0038] Figure 22 illustrates an example of layering for a 5,3 irreversible transform.
[0039] Figure 23 illustrates an example in which the transform has 5 levels and the data is divided up into layers 0-3.
[0040] Figure 24 illustrates one example of a situation in which flicker may be avoided in which values in first and
third frames are used to set the value in the second frame.
[0041] Figure 25 is a block diagram of a prior art decoding process that includes color management.
[0042] Figure 26 illustrates one embodiment of a non-preferred camera encoder.
[0043] Figure 27 illustrates one embodiment of a simpler camera encoder.
[0044] Figure 28 is a flow diagram of one embodiment of a process for applying an inverse transform with clipping
on partially transformed coefficients.

DETAILED DESCRIPTION OF THE PRESENT INVENTION 

[0045] Improvements to compression and decompression schemes are described. It is a purpose of the techniques
and implementations described herein to use choices in JPEG 2000 to make high speed, low cost, low memory and/or
feature rich implementations.

[0046] In the following description, numerous details are set forth in order to provide a thorough explanation of the
present invention. It will be apparent, however, to one skilled in the art, that the present invention may be practiced
without these specific details. In other instances, well-known structures and devices are shown in block diagram form,
rather than in detail, in order to avoid obscuring the present invention.

[0047] Some portions of the detailed descriptions which follow are presented in terms of algorithms and symbolic
representations of operations on data bits within a computer memory. These algorithmic descriptions and
representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their
work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of
steps leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually,
though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored,
transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of
common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.

[0048] It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate
physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise
as apparent from the following discussion, it is appreciated that throughout the description, discussions utilizing terms
such as "processing" or "computing" or "calculating" or "determining" or "displaying" or the like, refer to the action and
processes of a computer system, or similar electronic computing device, that manipulates and transforms data
represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly
represented as physical quantities within the computer system memories or registers or other such information storage,
transmission or display devices.

[0049] The present invention also relates to apparatus for performing the operations herein. This apparatus may be
specially constructed for the required purposes, or it may comprise a general purpose computer selectively activated
or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer
readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs,
and magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs,
magnetic or optical cards, or any type of media suitable for storing electronic instructions, and each coupled to a
computer system bus.

[0050] The algorithms and displays presented herein are not inherently related to any particular computer or other
apparatus. Various general purpose systems may be used with programs in accordance with the teachings herein, or
it may prove convenient to construct more specialized apparatus to perform the required method steps. The required
structure for a variety of these systems will appear from the description below. In addition, the present invention is not
described with reference to any particular programming language. It will be appreciated that a variety of programming
languages may be used to implement the teachings of the invention as described herein.

[0051] A machine-readable medium includes any mechanism for storing or transmitting information in a form readable
by a machine (e.g., a computer). For example, a machine-readable medium includes read only memory ("ROM");
random access memory ("RAM"); magnetic disk storage media; optical storage media; flash memory devices; electrical,
optical, acoustical or other form of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.); etc.

Overview 

[0052] The following descriptions relate to implementations or novel ways to take advantage of the flexibility of JPEG
2000 or other coding schemes with similar features.

Memory Usage for Low Memory and Fast Burst Access

[0053] Figure 2 shows one embodiment of an organization for an image in memory 201. Referring to Figure 2, only
the "tile height" raster lines, or a band of the image, are in memory 201, not the whole image. Thus, the amount of an
image in memory 201 is equal to the image width multiplied by the tile height. Inside the band of the image is at least
one tile, such as tile 210.

[0054] The wavelet transform processing logic 202 includes memory access logic 202A to read data from and store
data to memory 201 to enable wavelet transform 202B to be applied to the data (image data or coefficients depending
on the level of coefficient). Wavelet processing logic 202 may comprise hardware, software or a combination of both.

[0055] In one embodiment, access logic 202A accesses the tile with four parameters: a pointer or index to the start
of the tile in memory, the width of the tile, the height of the tile, and the line offset to get from the start of one line to
another (the image width). Alternatively, access logic 202A accesses memory 201 using a pointer or index to the end
of the tile instead of the width of the tile.

[0056] In one embodiment, in order to access each line of a tile, or a portion of a line of an image, to perform some
function F, the following process may be used.
    line = start
    for y = 0 to tile_height - 1
        for x = 0 to tile_width - 1
            perform function F with line[x]
        line = line + line_offset

One of the functions F may include applying a wavelet transform on pairs of lines. Another function F may be a
DC level shift or a multiple component transform.
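
The access process above can be sketched in Python as follows. This is only an illustrative sketch: the flat list standing in for memory 201, the function name `for_each_sample`, and the choice of subtracting 128 as the DC level shift (which assumes unsigned 8-bit samples) are assumptions, not details taken from the description.

```python
def for_each_sample(memory, start, tile_width, tile_height, line_offset, f):
    """Apply f to every sample of a tile stored inside a flat band buffer."""
    line = start  # index of the first sample of the current tile line
    for _y in range(tile_height):
        for x in range(tile_width):
            memory[line + x] = f(memory[line + x])
        line += line_offset  # step to the same column on the next raster line
    return memory


# A 2-line band of an image that is 8 samples wide (assumed sizes).
# The tile is the 4x2 block that starts at column 4 of the first line.
image_width = 8
band = list(range(16))
for_each_sample(band, start=4, tile_width=4, tile_height=2,
                line_offset=image_width, f=lambda v: v - 128)
```

Only the samples inside the tile are touched; the samples of the neighboring tile in the same band are left unchanged because the line offset is the image width rather than the tile width.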

[0057] Such a process would be performed by processing logic that may comprise hardware (e.g., dedicated logic,
circuitry, etc.), software (such as is run on a general purpose computer system or a dedicated machine), or a
combination of both.

[0058] In one embodiment, coefficients from a subband are accessed using a similar scheme with a starting point,
width, height and line offset. Because rows of coefficients are stored together in memory, rows may be accessed
efficiently when stored in cache, burst accessible memories or memories that are wider than one coefficient.
[0059] Figures 3A-C show a transform memory organization for various levels depicting conceptually how coefficients
may be stored. All LH, HL and HH coefficients (using the nomenclature of ITU-T Rec. T.800/ISO/IEC 15444-1:2000 JPEG
2000 Image Coding System) are coded. These coefficients are represented by dotted lines in Figures 3B and 3C. Input
lines of input tile 301 and LL coefficients (shown as solid lines in Figures 3B and 3C in successive levels) only need
to be stored temporarily while computing the transform, with the exception of the final transform level's LL coefficients,
which are coded. When a transform is used that does the horizontal and vertical transforms in one pass and uses line
buffers, once a pair of input rows has been completely read (input lines or LL coefficients), the space used by those
lines can be reused.

[0060] Figures 3A-C show input tile 301, level 1 (L1) (302) and level 2 (L2) (303) memory areas aligned with an offset
to indicate how reuse might be accomplished in one embodiment. The addition of two rows, rows 312 and 313, to the
memory space used to hold input tile 301 is needed to generate the L1 coefficients when reusing the memory for input
tile 301 for L1 coefficients. The addition of four rows, rows 341-342, to the memory space used to hold the L1 coefficients
is needed to generate the L2 coefficients when reusing the memory storing the L1 coefficients for L2 coefficients. (Note
that there are two rows between rows 341 and 342 that are wasted space.) The additional lines are preferably behind
the direction the wavelet transform is being applied to the information in the memory.

[0061] More specifically, a pair of input rows of input tile 301 corresponds to one row of each of LL, LH, HL and HH
coefficients at level 1, resulting from the application of a transform to two different lines, with the results of applying the
wavelet transform being written into lines of the memory. For example, the results of applying a wavelet transform to
input rows 310 and 311 are the coefficients in portions of rows 312 and 313 of L1 coefficients (302). For example, LL
coefficients 321 of row 312 correspond to the LL coefficients (solid line) of level 1, HL coefficients 322 of row 312
correspond to the HL coefficients of level 1, LH portion 323 of row 313 corresponds to the LH coefficients of level 1,
and HH portion 324 corresponds to the HH coefficients of level 1. Note that the level 1 coefficients from the first two
input lines are stored in two extra rows at the top of the memory, with the remaining level 1 coefficients being written
into the locations storing the data of input tile 301 to reuse the memory. The width and height for each type of coefficient
(e.g., LH, HL, HH) for each subband of level 1 coefficients is half that of input tile 301. The line offset from the LL row
to the next LL row for level 1 (e.g., the distance from row 312 to row 314 in Figure 3B) is twice the tile width (since
each coefficient row is from an area corresponding to two lines).
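
The level 1 reuse described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the implementation of access logic 202A: an unnormalized Haar-style butterfly (`haar_pair`) stands in for the wavelet transform because it needs only the current pair of rows, whereas a longer filter would keep more lines in its line buffers; the row-per-list memory model and all names are hypothetical.

```python
def haar_pair(row_a, row_b):
    """Stand-in 2-D transform of one pair of rows -> (LL, HL, LH, HH) half-width rows."""
    ll, hl, lh, hh = [], [], [], []
    for x in range(0, len(row_a), 2):
        a, b, c, d = row_a[x], row_a[x + 1], row_b[x], row_b[x + 1]
        ll.append(a + b + c + d)
        hl.append(a - b + c - d)
        lh.append(a + b - c - d)
        hh.append(a - b - c + d)
    return ll, hl, lh, hh


def level1_in_place(tile):
    """Compute level 1 coefficients while reusing the rows that held the input tile."""
    height = len(tile)
    # Two extra rows sit in front of the tile, i.e. behind the transform direction.
    memory = [None, None] + [list(row) for row in tile]
    for i in range(0, height, 2):
        # Read the next pair of input rows into line buffers (copies).
        buf_a, buf_b = list(memory[i + 2]), list(memory[i + 3])
        ll, hl, lh, hh = haar_pair(buf_a, buf_b)
        # Write the results two rows behind the rows just read. For i >= 2 these
        # rows held input that was already consumed in the previous iteration.
        memory[i] = ll + hl      # LL and HL portions share one row
        memory[i + 1] = lh + hh  # LH and HH portions share the next row
    return memory
```

After the loop, the last two rows of `memory` still hold stale input data, which corresponds to the rows at the bottom of memory that become unused after the first decomposition level.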

[0062] Similarly, the results of applying the wavelet transform to two rows of LL coefficients at level 1 (solid lines)
are the coefficients in two rows, namely LL coefficients (331), LH coefficients (332), HL coefficients (333) and HH
coefficients (334) at level 2. The width and height for level 2 coefficients is a quarter that of input tile 301. The line offset
for level 2 is four times the tile width (since each coefficient row is from an area corresponding to two level 1 LL rows
or four input lines). Thus, four extra lines of memory are needed to use the same memory that is storing the input tile
to store the L2 coefficients. Note that if a third decomposition level were being performed, an additional 8 lines would
be needed. Thus, in this example, a total of 14 extra lines (2 + 4 + 8) are needed to enable reuse of the memory that
stores an input tile when three levels of decomposition are applied thereto. A general formula that may be used to
determine the number of extra lines is as follows:

    extra lines = 2^(maxlevel + 1) - 2
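
The extra-line counts quoted above (2, 4 and 8 lines for levels 1, 2 and 3) double with each level, so the total is a geometric sum. The following sketch, whose function names are hypothetical, computes the total and the per-level line offset under that assumption.

```python
def extra_lines(levels):
    """Total extra lines when `levels` decomposition levels all reuse the tile memory."""
    return sum(2 ** level for level in range(1, levels + 1))


def line_offset(tile_width, level):
    """Distance between successive LL rows of `level`, in samples (2x, 4x, ... the tile width)."""
    return (2 ** level) * tile_width


# The geometric sum has the closed form 2^(levels + 1) - 2.
assert all(extra_lines(n) == 2 ** (n + 1) - 2 for n in range(1, 10))
```

For example, `extra_lines(3)` gives 14, and `line_offset(256, 2)` gives 1024, four times the width of a 256-sample-wide tile.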

[0063] To access subbands, such as the LL, LH, HL and HH subbands, only a starting pointer and the offset between
rows/lines are necessary. The height and width are also needed to know when to stop when accessing a tile.

[0064] As the number of decomposition levels increases, some rows at the bottom of memory become unused. That
is, the lines of memory below the L1 coefficients after the first decomposition level become unused, the lines of memory
below the L2 coefficients after the second decomposition level become unused, etc. In one embodiment, this extra
space may be reused.

[0065] Figures 3D-3F illustrate the corresponding inverse transform memory usage in which additional lines store
the results of applying an inverse transform and those additional lines are in the memory behind the direction the
inverse transform is being performed.

[0066] Figure 4A shows one embodiment of a single memory where the input and the various levels can be stored
during application of a forward transform. Referring to Figure 4A, locations for the input tile, level 1 coefficients, level
2 coefficients, and level 3 coefficients are shown with the added 2, 4 and 8 lines, respectively. Figure 4B shows a similar
single memory embodiment where the input coefficients of various levels of the transform can be stored along with the
output during application of an inverse transform.

[0067] Table 1 shows the amount of memory required for various transform levels for a 256x256 tile for separate 
memories and reused memory. 

Table 1

level   Separate memory (bytes)   Reused memory (bytes)
1       256x256 = 65,536          2x256 = 512
2       128x128 = 16,384          4x256 = 1,024
3       64x64 = 4,096             8x256 = 2,048
4       32x32 = 1,024             16x256 = 4,096
5       16x16 = 256               32x256 = 8,192
6       8x8 = 64                  64x256 = 16,384



[0068] For reused memory, the amount listed is the additional new memory used for that level. For this example,
reusing memory for levels 1, 2 and 3 saves memory. Level 4 may use a separate memory.
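
The two columns of Table 1 can be reproduced with a short calculation. This sketch assumes one byte per sample or coefficient, which is what the byte counts in the table imply; the function names are hypothetical.

```python
def separate_bytes(tile_width, tile_height, level):
    """Bytes for a separate buffer of the size listed for `level` in the table."""
    return (tile_width >> (level - 1)) * (tile_height >> (level - 1))


def reused_bytes(tile_width, level):
    """Additional bytes needed at `level` when the tile memory is reused (2^level extra lines)."""
    return (2 ** level) * tile_width


def memory_table(tile_width, tile_height, levels):
    """Return (level, separate, reused) rows for levels 1..levels."""
    return [(lvl, separate_bytes(tile_width, tile_height, lvl), reused_bytes(tile_width, lvl))
            for lvl in range(1, levels + 1)]


for lvl, separate, reused in memory_table(256, 256, 6):
    cheaper = "reuse" if reused < separate else "separate"
    print(f"level {lvl}: separate={separate:>6} reused={reused:>6} -> {cheaper}")
```

For a 256x256 tile, reuse needs less additional memory through level 3, and a separate memory is smaller from level 4 onward, which matches the observation above.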

[0069] The memory for levels 4, 5 and 6 could be placed in a single memory after level 3 has been generated or in
a completely different and separate memory. The amount of memory necessary is 38x32, which is less than 5x256.
Because there are two unused lines after generating the level 1 coefficients (i.e., the memory that stored the last two
lines of input data), a small memory savings can be achieved by letting levels 4, 5 and 6 reuse these two lines.
This is particularly important because the number of additional lines for levels 4, 5, and 6 is 16, 32 and 64, and the
extra space between the lines will be twice as far and half as wide as the level before.

[0070] In one embodiment, coefficients from levels 4, 5, and 6 are packed in a smaller memory structure, such as
storage area 450 in Figure 4. Referring to Figure 4, the level 4 coefficients are stored in an area having a height equal
to the tile height divided by 8 (2^3, where 3 corresponds to the number of levels) and a width equal to the tile width w
divided by 8 (2^3, where 3 corresponds to the number of levels previously stored elsewhere). An additional two lines
451 are all that is needed to store level 5 coefficients in the same storage area. Similarly, an additional four
lines are all that is necessary to accommodate using this memory storage area for the level 6 coefficients. Note that no
lines are skipped when storing the coefficients. In one embodiment in which a 256x256 tile is being processed, the
extra 5 lines at the bottom of storage area 430, two lines 421 and approximately 4.75 lines 422 are used to accommodate
storage area 450. As shown, the approximately three lines 422 represent memory allocated in addition to that
necessary to store the input tile. In this manner, the storage area for the input tile is almost completely reused.

[0071] In one embodiment, to use very little, or potentially the minimum, memory, level 6 is stored separately from
levels 4 and 5. However, this only saves 64 bytes of memory.

[0072] A memory a little smaller than 273x256 can hold all the transform coefficients for a 256x256 tile. This is less
than 7% more than a true in-place memory organization. Unlike an in-place memory organization, extra copies are
avoided while simultaneously keeping the rows packed together for fast access.

[0073] Table 2 shows another example of using separate versus reused memory for 128x128 tiles. For this size, the
first three transform levels can reuse memory in a 142x128 buffer.



Table 2

level   Separate memory (bytes)   Reused memory (bytes)
1       128x128 = 16,384          2x128 = 256
2       64x64 = 4,096             4x128 = 512
3       32x32 = 1,024             8x128 = 1,024



[0074] In one embodiment, a decision to use in-place memory or new memory is a function of tile height and transform
level. Such a decision may be based on the following:

    if tile height > 2^(3*level-2), then use in-place method
    if tile height = 2^(3*level-2), then either may be used
    if tile height < 2^(3*level-2), then use new memory

To illustrate the application of the decision, see Table 3 below:



Table 3

level   2^(3*level-2)
1       2
2       16
3       128
4       1024
5       8192
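
The decision rule can be written directly as a small function. This is a sketch; the function name and the returned labels are hypothetical. The threshold is the tile height at which the 2^level extra lines needed for reuse cost exactly as much as a separate buffer of (tile height / 2^(level-1)) lines of (tile width / 2^(level-1)) samples.

```python
def memory_choice(tile_height, level):
    """Choose between reusing (in-place) memory and new memory for one transform level."""
    threshold = 2 ** (3 * level - 2)
    if tile_height > threshold:
        return "in-place"
    if tile_height == threshold:
        return "either"
    return "new memory"


# Thresholds for levels 1 through 5: 2, 16, 128, 1024, 8192.
thresholds = [2 ** (3 * level - 2) for level in range(1, 6)]
```

For a tile height of 256, the rule selects the in-place method for levels 1 through 3 and new memory from level 4 onward.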



[0075] In some applications, adapting the memory organization to the tile height is inconvenient. A single fixed
memory organization can be used. Tile sizes smaller than 128x128 typically result in bad compression performance, so
would typically not be used. While tile sizes bigger than 1Kx1K can be used for very large images, this does not
significantly improve compression and the large amount of memory required would typically be burdensome. Therefore,
assuming a tile height between 128 and 1024 inclusive and using in-place memory for 3 levels of the transform is a
good heuristic.

[0076] Decoding is similar in that the results of applying an inverse transform are written ahead of where the decoding
processing logic is reading, with the only notable difference being that the start is from the highest level to the lowest
level, such as level 6 to level 1 in the example above. In such a case, the input tile ends up at the top of the memory
structure. The extra lines to accommodate the memory reuse are in decreasing order. For example, using the structure
of Figure 4B, 8 lines would be necessary to create the L2 coefficients from the L3 coefficients, 4 extra lines would be
necessary to create the L1 coefficients from the L2 coefficients and 2 extra lines would be necessary to create the
input tile from the L1 coefficients.

[0077] In one embodiment, to handle input tile data, a color conversion may be performed on the data prior to
encoding. Figure 5 illustrates one embodiment of the process of handling the input data. Referring to Figure 5, color input
pixels are received in raster order. These color pixels may be in RGB, YCrCb, CMY, CMYK, grayscale, etc. The color
input pixels may be stored as tiles in a memory, such as memory 501, by band (or other forms).

[0078] Pixels from storage 501 or received directly from the input undergo color conversion and/or level shifting, with
the resulting outputs being stored in one of coefficient buffers 502_1-502_N. That is, once the color conversion has been
completed on each tile, it is stored in one of the coefficient buffers 502_1-502_N, and then the next tile can be processed.
In one embodiment, there is one coefficient buffer for each component.

[0079] Coefficient buffers 502_1-502_N are used by the transform in the manner described above to perform the wavelet
transform while reusing memory. Thus, coefficient buffers 502_1-502_N are both input and output to the wavelet transform.

[0080] After the transform is applied to coefficient buffers 502_1-502_N, the context model 503 and entropy coder 505
can perform further compression processing on the already transformed data. The coded data is buffered in coded
data memory 505.

[0081] While performing the further compression processing on one tile, the transform may be applied to another
tile. Similarly, any or all the operations may be performed on multiple tiles at the same time.
Progression Order Conversion

[0082] In the JPEG 2000 standard, data in a compressed codestream can be stored in one of five progression
orders. The progression order can change at different points in the codestream. The order is defined by embedded
"for loops" on layers, precincts, resolution, and components.

[0083] Five progression orders are described in Table A-16 of the JPEG 2000 standard. They are
layer-resolution-component-position progression (LRCP), resolution-layer-component-position progression (RLCP),
resolution-position-component-layer progression (RPCL), position-component-resolution-layer progression (PCRL), and
component-position-resolution-layer progression (CPRL).
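
Reading each four-letter name as a set of nested loops, with the first letter as the outermost loop, gives the packet sequence for that order. The sketch below is a simplification made for illustration: it assumes every loop has a fixed count, whereas in a real codestream the number of precincts can differ per resolution level and the position-based orders iterate over spatial locations. The function name and the `counts` dictionary are hypothetical.

```python
from itertools import product


def packet_sequence(order, counts):
    """Return packet identifiers in the sequence implied by a progression order.

    `order` is a four-letter name such as "LRCP" (L = layer, R = resolution,
    C = component, P = position). `counts` maps each letter to the number of
    values its loop takes.
    """
    axes = [range(counts[axis]) for axis in order]
    # itertools.product varies the last axis fastest, so the first letter is the outer loop.
    return [dict(zip(order, index)) for index in product(*axes)]


counts = {"L": 2, "R": 2, "C": 1, "P": 1}
lrcp = [(p["L"], p["R"]) for p in packet_sequence("LRCP", counts)]
rlcp = [(p["L"], p["R"]) for p in packet_sequence("RLCP", counts)]
```

With these counts, `lrcp` lists all resolutions of layer 0 before any packet of layer 1, while `rlcp` lists all layers of resolution 0 before any packet of resolution 1.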

[0084] The order may be defined in the COD or POC markers of the JPEG 2000 standard. The Coding style default
(COD) marker is defined by the JPEG 2000 standard and describes the coding style, number of decomposition levels,
and layering that is the default used for compressing all components of an image (if in the main header) or a tile (if in
a tile-part header). The Progression order change (POC) marker describes the bounds and progression order for any
progression order other than that specified in the COD marker segments in the codestream. The Packet Length, Main
header (PLM) marker indicates a list of packet lengths in tile-parts for every tile-part in order, and the Packet Length,
Tile-part header (PLT) marker indicates tile packet lengths in a tile-part and indicates where the data is in the codestream.

[0085] The JPEG 2000 standard in section B.12 only specifies how packets of compressed data are formed for a given
progression order. It does not describe how data should be converted from one progression order to another progression
order.

[0086] In one embodiment, a progression order converting parser converts a codestream to a desired progression
order based on user input without decoding the data and then encoding it again. Figure 6A illustrates a system
having such a parser. Referring to Figure 6A, parser 601 receives requests from a client for a particular progression
order. The client may be viewing a web page and select a particular link. In response to the request, parser 601
accesses server 602 to obtain the codestream associated with full image 603 from memory 604 and converts the
codestream into a different progression order based on the request. The request indicates the progression order by
using an optional command (e.g., R12L (Resolution-layer progression to Layer progression)). The progression order
that is described may be based on layer, resolution, component, precinct, or tile.

[0087] Figure 6B illustrates the progression converter converting from a layer progressive codestream (LRCP) to a
resolution progressive (RLCP) codestream. The progression orders map directly to each other.

[0088] Figure 7A shows multiple ways to convert a codestream from one progression order to another. Referring to
Figure 7A, each of the five progressions (LRCP, RLCP, RPCL, CPRL, and PCRL) is shown with paths to each of the
others, such that all progressions are shown. In one embodiment, the parser causes all conversions to go through the
layer progression first and then to a selected conversion. Figure 7B shows one embodiment of such simplified
conversion paths in which the number of required mappings is reduced from 10 (as in Figure 7A) to 4. However, any one of
the five progression orders could be used as the one to which all are converted before arriving at the selected order.
The conversion technique described herein simplifies the source code in that it requires far fewer lines of source code
than implementing every direct conversion. This results in less debug time and fewer memory and run-time variables.
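
The reduction from 10 mappings to 4 can be checked with a short sketch. The Python below is only an illustration: the function names and the choice of LRCP as the intermediate ("hub") order are assumptions made for this example, not part of the described parser.

```python
from itertools import combinations

# The five progression orders named in the text.
ORDERS = ["LRCP", "RLCP", "RPCL", "PCRL", "CPRL"]

def direct_mappings(orders):
    """One mapping per unordered pair of progression orders."""
    return list(combinations(orders, 2))

def hub_mappings(orders, hub="LRCP"):
    """One mapping between the hub order and each other order."""
    return [(hub, order) for order in orders if order != hub]

def conversion_path(src, dst, hub="LRCP"):
    """Orders visited when every conversion is routed through the hub."""
    path = [src]
    if src != dst and src != hub and dst != hub:
        path.append(hub)
    if src != dst:
        path.append(dst)
    return path
```

With five orders, `direct_mappings(ORDERS)` has 10 entries and `hub_mappings(ORDERS)` has 4. A conversion between two non-hub orders takes two steps, which is the price paid for the smaller number of mappings.
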
[0089] To perform the conversion, the packets in the codestream must be reordered. The packets are
labeled by their sequential order in the codestream. Markers may indicate the starting point of the data, the length of
the data (or alternatively the endpoint of the data) and how the data should be handled. For example, the indication of
how the data is to be handled may indicate whether the data is to be deleted, whether the data is to be truncated, or
some other operation to be performed on the data. Such handling information may also come from rate distortion
information, such as may be provided in the PLT/PLM and/or the PPT/PPM marker sets of the JPEG 2000 standard. In
this manner, the codestream may be truncated without changing the packet header.

[0090] In one embodiment, a list, array, or other structure (such as reordering structure 601A) is built by indicating
the portion of data in each packet. Using this structure, the packets may be reordered.

[0091] Figure 8 illustrates one embodiment of a process for performing progression order conversion. The process
is performed by processing logic that may comprise hardware (e.g., dedicated logic, circuitry, etc.), software (such as
is run by, for example, a general purpose computer or dedicated machine), or a combination of both.

[0092] Referring to Figure 8, the process begins by processing logic building a list from headers in the packets
(processing block 801) and optionally marking list items "delete" for quantization (processing block 802). Next,
processing logic reorders the list to map the original progression to a desired progression, including handling input and
output with progressions specified with POC markers (bounds on the progression order) (processing block 803). Thereafter,
processing logic outputs coded data based on the reordered list (processing block 804).
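
The four processing blocks can be sketched as follows. This is a minimal illustration under stated assumptions: packet lengths and association indices are taken as already known (for example from PLT/PLM markers), `PacketEntry` and all function names are hypothetical, and a real parser would also have to keep tile headers and marker segments consistent.

```python
from dataclasses import dataclass

@dataclass
class PacketEntry:
    offset: int           # byte offset of the packet in the codestream
    length: int           # packet length in bytes
    key: tuple            # (layer, resolution, component, precinct)
    delete: bool = False  # set when the packet is to be dropped

def build_list(lengths, keys, start=0):
    """Block 801: one list entry per packet, in codestream order."""
    entries, offset = [], start
    for length, key in zip(lengths, keys):
        entries.append(PacketEntry(offset, length, key))
        offset += length
    return entries

def mark_deleted(entries, max_layer):
    """Block 802: mark packets above max_layer as "delete" (quantization)."""
    for entry in entries:
        if entry.key[0] > max_layer:
            entry.delete = True

def reorder(entries, sort_key):
    """Block 803: stable sort of the list into the desired progression."""
    return sorted(entries, key=lambda entry: sort_key(entry.key))

def output(codestream, entries):
    """Block 804: emit the coded data of the surviving packets in list order."""
    return b"".join(codestream[e.offset:e.offset + e.length]
                    for e in entries if not e.delete)
```

For example, sorting with `lambda k: (k[1], k[0], k[2], k[3])` turns a layer-progressive list into a resolution-progressive one. Only the list is rearranged; the coded bytes are copied, never decoded.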

[0093] Therefore, the combination of re-ordering and parsing allows specification of the desired ordering, resolution,
quality, etc.

A Progression Order Conversion Example

[0094] The following is an example showing how packets are arranged in a codestream. The codestream was formed
based on 2 components, 2 layers, 3 decomposition levels, and layer progression.

[0095] Table 4 shows the packet order, length and association index of packets in the example. The packet order
column shows the sequential order of packets placed in a codestream. The length indicates the length of the packets.
The association index shows the resolution, layer, component and precinct of the packet.

[0096] For example, packet[0] is the first packet in the codestream after the first tile header. It has a length of 589
bytes. Association index RwLxCyPz indicates the packet belongs to resolution w, layer x, component y and precinct z.



Table 4

Packet order    Length          Association Index
packet[0]       length=589      R0L0C0P0
packet[1]       length=589      R0L0C1P0
packet[2]       length=924      R1L0C0P0
packet[3]       length=924      R1L0C1P0
packet[4]       length=1602     R2L0C0P0
packet[5]       length=1602     R2L0C1P0
packet[6]       length=733      R3L0C0P0
packet[7]       length=733      R3L0C1P0
packet[8]       length=535      R0L1C0P0
packet[9]       length=535      R0L1C1P0
packet[10]      length=1523     R1L1C0P0
packet[11]      length=1523     R1L1C1P0
packet[12]      length=5422     R2L1C0P0
packet[13]      length=5422     R2L1C1P0
packet[14]      length=16468    R3L1C0P0
packet[15]      length=16468    R3L1C1P0



[0097] In this codestream, packets are grouped based on the layer in which they reside. The first 8 packets belong
to Layer 0. The following 8 packets belong to Layer 1.

[0098] Using the conversion process described herein, the above codestream is converted to resolution layer
progression. The following shows how the above packets are re-ordered.

[0099] After the layer progressive codestream is converted to resolution progression, packets in the new codestream
are grouped based on resolution. Such a grouping is shown in Table 5. The first 4 packets belong to resolution 0, the
next 4 packets to resolution 1, and so on.



Table 5

Previous packet order    Packet order    Length          Association Index
0                        packet[0]       length=589      R0L0C0P0
1                        packet[1]       length=589      R0L0C1P0
8                        packet[2]       length=535      R0L1C0P0
9                        packet[3]       length=535      R0L1C1P0
2                        packet[4]       length=924      R1L0C0P0
3                        packet[5]       length=924      R1L0C1P0
10                       packet[6]       length=1523     R1L1C0P0
11                       packet[7]       length=1523     R1L1C1P0
4                        packet[8]       length=1602     R2L0C0P0
5                        packet[9]       length=1602     R2L0C1P0
12                       packet[10]      length=5422     R2L1C0P0
13                       packet[11]      length=5422     R2L1C1P0
6                        packet[12]      length=733      R3L0C0P0
7                        packet[13]      length=733      R3L0C1P0
14                       packet[14]      length=16468    R3L1C0P0
15                       packet[15]      length=16468    R3L1C1P0



One Embodiment of a Conversion Algorithm 
[0100] Resolution to Layer Progression

    n = 0;
    for (l = 0; l < layer; l++) {
        for (r = 0; r < resolution + 1; r++) {
            for (c = 0; c < component; c++) {
                /* old packets are in resolution-layer-component order */
                new_packet[n] = old_packet[l*component + r*layer*component + c];
                n++;
            }
        }
    }

Layer to Resolution Progression

    n = 0;
    for (r = 0; r < resolution + 1; r++) {
        for (l = 0; l < layer; l++) {
            for (c = 0; c < component; c++) {
                /* old packets are in layer-resolution-component order */
                new_packet[n] = old_packet[r*component +
                                           l*(resolution+1)*component + c];
                n++;
            }
        }
    }

where layer = the number of layers in a codestream,

resolution = the number of decomposition levels in a codestream, and

component = the number of components in a codestream.
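
As a check, the two index formulas can be applied to the example of Tables 4 and 5 (2 layers, 3 decomposition levels, 2 components). The Python below is a direct transcription offered as a sketch; the function names are chosen for this illustration only.

```python
def layer_to_resolution(old_packets, layer, resolution, component):
    """Reorder a layer-progressive packet list into resolution progression."""
    new_packets = []
    for r in range(resolution + 1):      # resolutions = decomposition levels + 1
        for l in range(layer):
            for c in range(component):
                index = r * component + l * (resolution + 1) * component + c
                new_packets.append(old_packets[index])
    return new_packets

def resolution_to_layer(old_packets, layer, resolution, component):
    """Reorder a resolution-progressive packet list into layer progression."""
    new_packets = []
    for l in range(layer):
        for r in range(resolution + 1):
            for c in range(component):
                index = l * component + r * layer * component + c
                new_packets.append(old_packets[index])
    return new_packets

# Packets of Table 4 identified by their original sequence numbers 0..15.
previous_order = layer_to_resolution(list(range(16)), 2, 3, 2)
```

Here `previous_order` reproduces the "Previous packet order" column of Table 5, and applying `resolution_to_layer` to it restores the original sequence.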



Data Hiding (Sideband Information) in JPEG 2000 Coding

[0101] Bit hiding allows sideband information to be transmitted without increasing the file size. Sideband information
that does increase file size but does not break naive decoders might also be valuable (although the COM marker
defined by the JPEG 2000 standard might be used instead).



[0102] Some marker segments, packet headers and packets are padded out to the nearest byte. Examples of the
JPEG 2000 marker segments include PPM, PPT, PLM, and PLT. In addition, some marker segments can be longer
than they need to be, including QCD, QCC, and POC. In all of these cases, the padded data values are not defined.

[0103] Several proprietary coding schemes could use this semi-randomly located undefined data to provide a number
of important types of information including, but not limited to, decoding and filtering hints, ownership, segmentation
hints, and so on. A hint might include an index to a particular enhancement scheme. For example, if it is known that
an image is mostly text, a value may be sent that indicates that a first post-processing filter is to be used. On the other
hand, if the area is mostly a graphic image, then a value may be sent that indicates that a second post-processing
filter is to be used.

[0104] The following are places where bits may be hidden or sideband information may be stored in the codestream:

arithmetic coder (AC) termination (without predictable termination)
end of packet header rounding to byte
after last packet before next tile
tag tree construction by not always using minimum
packet header Lblock signalling
LSB parity for codeblocks (refinement pass only, cleanup pass only, all)
QCD, QCC extra subbands, POC.

[0105] For example, with respect to hiding data using AC termination, at least 0 to 7 bits are provided every time
the coder is terminated. However, this could be extended for a few bytes. These extra bits and bytes may be used for
sending extra information.

[0106] With respect to each packet header, the end of a packet header is rounded to a byte boundary. Therefore,
there may be 1 to 7 bits available for sending extra information at times when rounding would have been
necessary. Similarly, each packet is rounded to a byte boundary, thereby providing 1 to 7 bits (assuming that rounding
would have been necessary). Also, the last packet in a tile-part can be extended a few bytes. These extra bytes may
be used to send additional information.
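
The number of bits freed by rounding to a byte boundary is simple modular arithmetic. The sketch below assumes a header held as a list of 0/1 values and uses hypothetical function names; it ignores any constraints the codestream syntax may place on the value of the padded byte, which a real encoder would have to respect.

```python
def spare_bits(bit_length):
    """Padding bits needed to round bit_length up to a byte boundary (0 to 7)."""
    return (-bit_length) % 8

def pad_with_hidden_bits(header_bits, hidden_bits):
    """Fill the byte-alignment padding with sideband bits.

    Returns the padded bit list and the number of hidden bits that fit.
    Padding positions not covered by hidden_bits are set to 0.
    """
    room = spare_bits(len(header_bits))
    carried = list(hidden_bits)[:room]
    padding = carried + [0] * (room - len(carried))
    return header_bits + padding, len(carried)
```

A 13-bit header leaves 3 spare bits, so only the first three sideband bits are carried; a header that already ends on a byte boundary carries none.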

[0107] The length of the compressed data for a code-block can be given in the packet header with a non-minimum
representation. The choice of representation (e.g., a non-minimum representation) could be used for indicating other
information.

[0108] With respect to tag tree data hiding, packet headers of the JPEG 2000 standard use tag trees for coding first
inclusion and zero bitplane information. When there are multiple codeblocks, tag trees are like a quadtree of minimum
values. For example, in the case of 16 codeblocks in a 4x4 arrangement in a packet, the arrangement may be as follows:



    10    7   12   15
     3   20   21    5
    81   45    5    9
    18    8   12   24



An example tag tree, which is minimal for the 4x4 arrangement above, is as follows:

     3

     0    2
     5    2

     7    4    7   10
     0   17   16    0
    73   37    0    4
    10    0    7   19



in which "3" is added to every codeblock's value, and "0", "2", "5" and "2" are each added to the 4 corresponding
codeblocks. Finally, there is one value per codeblock. That is, the minimal tag tree is created by taking the first 2x2
group in the 4x4 arrangement above and finding the minimum of its four values. In this case, for the 2x2 block

    10    7
     3   20

the minimum value is 3. This is then performed on the other 2x2 blocks. Then these identified minimum values are
evaluated again to determine their minimum, which would be "3" in the example. Then the minimum value is subtracted
from the four minimum values to create the following

     0    2
     5    2

Then, for the remaining numbers in the 4x4, the number 3 is subtracted from each value along with the value in the
2x2 that corresponds to the particular value in the 4x4 arrangement, thereby resulting in the tag tree above.
The first row adds up as follows:

    10 = 3 + 0 + 7
     7 = 3 + 0 + 4
    12 = 3 + 2 + 7
    15 = 3 + 2 + 10

[0109] A variable length code may be used that efficiently represents small numbers.

[0110] An example of a tag tree that is not minimal is as follows:



     2

     1    3
     6    3

     7    4    7   10
     0   17   16    0
    73   37    0    4
    10    0    7   19

(Note that representing "3", "0", "2", "5" and "2" might use less bitstream data than "2", "1", "3", "6" and "3".)



[0111] Once a tag tree representation has been made, a determination can be made as to whether the representation
is minimal or not based on whether there is a zero in the 2x2 block. Therefore, this information is hidden. For example,
the 1 in the 2x2 block above indicates that it is not part of a minimal tag tree, but can be used to
convey some particular information to a decoder. Likewise, if a 2 was the minimal value in the 2x2 block, such a fact
may convey different information to a decoder.
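
The construction of the minimal tag tree and the minimality test can be sketched for the 4x4 example. This is only an illustration of the two-level decomposition described here, not the tag tree coding procedure of the JPEG 2000 standard; the function names are hypothetical, and the minimality test additionally requires a zero in every 2x2 leaf group.

```python
GRID = [[10, 7, 12, 15],
        [3, 20, 21, 5],
        [81, 45, 5, 9],
        [18, 8, 12, 24]]

def minimal_tag_tree(grid):
    """Return (root, 2x2 offsets, 4x4 residuals) for a 4x4 grid of values."""
    mins = [[min(grid[2 * i + a][2 * j + b] for a in (0, 1) for b in (0, 1))
             for j in range(2)] for i in range(2)]
    root = min(min(row) for row in mins)
    offsets = [[m - root for m in row] for row in mins]
    residuals = [[grid[r][c] - mins[r // 2][c // 2] for c in range(4)]
                 for r in range(4)]
    return root, offsets, residuals

def reconstruct(root, offsets, residuals):
    """Each value is root + its 2x2 offset + its own residual."""
    return [[root + offsets[r // 2][c // 2] + residuals[r][c] for c in range(4)]
            for r in range(4)]

def is_minimal(offsets, residuals):
    """Minimal when the 2x2 level and every 2x2 leaf group contain a zero."""
    if not any(v == 0 for row in offsets for v in row):
        return False
    return all(any(residuals[2 * i + a][2 * j + b] == 0
                   for a in (0, 1) for b in (0, 1))
               for i in range(2) for j in range(2))
```

Both the minimal tree (root 3) and the non-minimal tree (root 2 with 2x2 values 1, 3, 6, 3) reconstruct the same 4x4 values, which is why the choice between them can carry a hidden bit without affecting what a decoder recovers.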

[0112] The JPEG 2000 POC, QCD, and QCC markers can have redundant entries. It is as if the codestream were
quantized and the markers were not rewritten. For example, the QCD and QCC markers have values for a number of
subbands specified by the syntax of the marker. If there are fewer subbands actually coded in the bitstream, data may
be hidden in the values used for the missing subbands. The redundant entries may be replaced and used for hidden
or sideband information.

[0113] The hidden or sideband information may include post-processing hints (such as, for example, sharpen this
tile with a specified filter or strength, or smooth, or perform optical character recognition (OCR) on this region, etc.),
decoding hints, security (such as, for example, an encryption key for decoding the remainder of the image or another
image, etc.), codestream identification (such as, for example, labeling POTUS as the originator of the file, etc.) and/or
other information.

Use of Layers When Encoding 

[0114] Layers are part of the JPEG 2000 standard. In one embodiment, sideband information, possibly in a COM marker,
is used by the decoder to allow selecting of layers during decoding. The sideband information may be used to select
layers for postcompression quantization to meet rate/distortion targets for different viewing distances, different
resolutions, different regions of interest, and different frequency content for analysis (e.g., finding edges of text).

[0115] In one embodiment, the layers are predefined based on rate. For example, the first layer represents a 1-bit
per pixel image, while the second layer represents a 2-bit per pixel image, etc. Therefore, the layers run from the lowest
quality to the highest quality. Likewise, target rates can be met for lower resolutions as well.

[0116] The sideband information may be stored in a marker segment of the codestream. In one embodiment, the
JPEG 2000 comment (COM) marker is used to provide information about the layers. Specifically, the COM marker may
be used to indicate the number of bytes for each resolution and/or rate across the entire image or a relative number
of bytes for each additional layer. Table 6 indicates each layer and its resolution in the number of bytes across the tile
in an image. Such a table may have distortion values instead.



Table 6

lev=0    layer=0    comp=0    bytes=529
lev=0    layer=0    comp=1    bytes=555
lev=0    layer=0    comp=2    bytes=493
lev=0    layer=1    comp=0    bytes=129
lev=0    layer=1    comp=1    bytes=130
lev=0    layer=1    comp=2    bytes=123
lev=0    layer=2    comp=0    bytes=7
lev=0    layer=2    comp=1    bytes=8
lev=0    layer=2    comp=2    bytes=12
lev=0    layer=3    comp=0    bytes=1
lev=0    layer=3    comp=1    bytes=1
lev=0    layer=3    comp=2    bytes=129
lev=1    layer=0    comp=0    bytes=705
lev=1    layer=0    comp=1    bytes=898
lev=1    layer=0    comp=2    bytes=712
lev=1    layer=1    comp=0    bytes=146
lev=1    layer=1    comp=1    bytes=114
lev=1    layer=1    comp=2    bytes=116
lev=1    layer=2    comp=0    bytes=224
lev=1    layer=2    comp=1    bytes=250
lev=1    layer=2    comp=2    bytes=263
lev=1    layer=3    comp=0    bytes=201
lev=1    layer=3    comp=1    bytes=212
lev=1    layer=3    comp=2    bytes=200
lev=2    layer=0    comp=0    bytes=889
lev=2    layer=0    comp=1    bytes=1332
lev=2    layer=0    comp=2    bytes=1048
lev=2    layer=1    comp=0    bytes=240
lev=2    layer=1    comp=1    bytes=329
lev=2    layer=1    comp=2    bytes=328
lev=2    layer=2    comp=0    bytes=599
lev=2    layer=2    comp=1    bytes=767
lev=2    layer=2    comp=2    bytes=725
lev=2    layer=3    comp=0    bytes=335
lev=2    layer=3    comp=1    bytes=396
lev=2    layer=3    comp=2    bytes=420
lev=3    layer=0    comp=0    bytes=1
lev=3    layer=0    comp=1    bytes=395
lev=3    layer=0    comp=2    bytes=402
lev=3    layer=1    comp=0    bytes=251
lev=3    layer=1    comp=1    bytes=450
lev=3    layer=1    comp=2    bytes=582
lev=3    layer=2    comp=0    bytes=525
lev=3    layer=2    comp=1    bytes=990
lev=3    layer=2    comp=2    bytes=1313
lev=3    layer=3    comp=0    bytes=1214
lev=3    layer=3    comp=1    bytes=1798
lev=3    layer=3    comp=2    bytes=2585

[0117] In another embodiment, the ordering could be by layer. Thus, the information above is consolidated for each
layer (not segregated by level or component), as shown below:

Ordering by layer=0 bytes=7959 bitrate=0.971558 PSNR=30.7785
Ordering by layer=1 bytes=10877 bitrate=1.327759 PSNR=32.0779
Ordering by layer=2 bytes=16560 bitrate=2.021464 PSNR=35.7321
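
The consolidation can be sketched by summing the Table 6 entries of one layer over all levels and components. In the Python below, the layer-0 byte counts are those of Table 6; the image size of 65,536 samples is an assumption inferred from the listed bytes and bitrate, not a value stated in the text, and the function names are hypothetical.

```python
# Layer-0 byte counts from Table 6, keyed by level, one entry per component.
LAYER0_BYTES = {
    0: [529, 555, 493],
    1: [705, 898, 712],
    2: [889, 1332, 1048],
    3: [1, 395, 402],
}

def layer_total(bytes_by_level):
    """Total bytes of one layer over all levels and components."""
    return sum(sum(values) for values in bytes_by_level.values())

def bitrate(total_bytes, num_samples):
    """Bits per sample for a given byte count (num_samples is assumed)."""
    return 8 * total_bytes / num_samples

def cumulative(per_layer_bytes):
    """Running totals, so that each entry includes all previous layers."""
    totals, running = [], 0
    for count in per_layer_bytes:
        running += count
        totals.append(running)
    return totals
```

Under that assumed image size, the layer-0 total of 7959 bytes corresponds to the listed bitrate of 0.971558, and the byte counts listed for layers 1 and 2 behave as running totals that include the earlier layers.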



[0118] Distortion by layers can be based on PSNR. For example:



layer=0 PSNR=30.7785 
layer=1 PSNR=32.0779 
layer=2 PSNR=35.7321 



[0119] In an alternative embodiment, such information may be hidden in the codestream as described above. The
information may be used to control rate distortion.

[0120] In another embodiment, the layers may be predefined for a particular viewing distance. In such a case, the
data is divided into layers from the highest frequency, lowest resolution to the lowest frequency, highest resolution.

[0121] In one embodiment, the layer information indicates the summation of bits across the entire image for that
layer and all previous layers (for example, the 16,011 bits listed next to layer 1 indicates the total number of bits for
layer 0 and layer 1). Alternatively, bytes, words, kilobytes, or other units of memory or rate could be used instead of
bits. Table 7 shows this type of absolute rate information.

[0122] Table 8 shows relative rate information. Layer 0 has 4096 bits, layer 1 has 11,915 bits, etc.



Table 7

layer    Rate (bytes)
0        4,096
1        16,011
2        40,000
3        100,000
4        250,000
5        500,000
6        1,000,000
7        2,500,000
8        5,500,000

Table 8

layer    Rate (bytes)
0        4,096
1        11,915
2        23,989
3        60,000
4        150,000
5        250,000
6        500,000
7        1,500,000
8        3,000,000



[0123] For example, if only 750,000 bytes may be allowed in the decoded image, then all that can be decoded (as
the 1,000,000 bytes tabulated with layer 6 includes the 500,000 bytes of layers 0-5) is through layer 5 and half of
importance layer 6. In some embodiments, no packets from layer 6 would be included. In other embodiments, some
packets from layer 6 would be included and others would be replaced by zero packets so that the total amount of layer
6 data was approximately 250,000 bytes.
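
The relationship between the absolute rates of Table 7 and the relative rates of Table 8, and the 750,000-byte example, can be worked through in a few lines. The function names below are hypothetical and the sketch only does the arithmetic; it does not decide which layer 6 packets to keep.

```python
# Absolute (cumulative) rates of Table 7, indexed by layer.
ABSOLUTE = [4096, 16011, 40000, 100000, 250000,
            500000, 1000000, 2500000, 5500000]

def relative_rates(absolute):
    """Per-layer increments: each absolute rate minus the previous one."""
    previous = [0] + absolute[:-1]
    return [a - p for a, p in zip(absolute, previous)]

def layers_within_budget(absolute, budget):
    """Return (number of complete layers that fit, bytes left for the next layer)."""
    complete = 0
    for total in absolute:
        if total > budget:
            break
        complete += 1
    used = absolute[complete - 1] if complete else 0
    return complete, budget - used
```

With a budget of 750,000 bytes, six complete layers (layers 0-5) fit and 250,000 bytes remain, which is half of the 500,000 bytes that layer 6 adds.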

[0124] Figure 22 illustrates an example of layering for a 5,3 irreversible transform with three levels, MSE or similar.
Referring to Figure 22, there are 45 layers shown. Each additional layer improves MSE in an order that gives good
rate-distortion for MSE.

[0125] Figure 23 illustrates another example in which the transform has 5 levels and the data is divided up into layers
0-3. Layer 0 corresponds to the thumbnail version, layers 0-1 correspond to the monitor (or screen) resolution, layers
0-2 correspond to the print resolution, and layers 0-3 correspond to lossless.

[0126] In an alternative embodiment, the layers may be predefined for some other distortion metric (e.g., MSE,
weighted MSE, sharpness of text, etc.).

[0127] The decoder uses the information regarding the layers from the codestream to select layers to generate an
image. The decoder, knowing the desired viewing characteristics from the application or implementation (see
Table 9 below), and using the information from the codestream specifying the layers, can quantize the codestream in
order to display an image at the correct viewing distance. Figure 9 illustrates such a decoder. Referring to Figure 9,
decoder 901 receives a codestream and includes quantization logic 902 that examines the COM marker and uses
information about the viewing distance, stored in storage 903, to generate quantized codestream 904 via, for
example, selecting the proper layers. Quantized codestream 904 is decoded by decoding logic 905 (e.g., a JPEG 2000
decoder) after selecting layers to generate image data 906. A naive decoder would simply ignore the data in the
comment marker.

[0128] Figure 10 is a flow diagram of a process for using layers when decoding. The process is performed by processing
logic that may comprise hardware (e.g., dedicated logic, circuitry, etc.), software (such as is run by, for example, a
general purpose computer or a dedicated machine), or a combination of both.

[0129] Referring to Figure 10, the process begins by processing logic receiving a codestream of compressed image
data (processing block 1001). The image data is organized into multiple layers, each of which comprises coded data
that adds visual value to the image (e.g., look sharper, better defined, better contrast, etc.). Next, processing logic
selects one or more layers for quantization based on sideband information (processing block 1002). After selection,
processing logic decompresses the non-quantized layers of the codestream (processing block 1003).

Editing of Tiles, Tile-parts, or Packets

[0130] Once a codestream is created, it may be desirable to edit parts of the image. That is, for example, after
performing encoding to create the codestream, a set of tiles may be decoded. After decoding the set of tiles, editing
may be performed, followed by encoding the set of tiles with the edits to the same size as the encoded tiles were prior
to their decoding. Examples of typical editing include sharpening of text and removing "red-eye." The JPEG 2000
codestream can be edited in memory or in a disk file system without rewriting the entire codestream.

[0131] Figure 11 is a flow diagram of one embodiment of an editing process. The process is performed by processing
logic that may comprise hardware (e.g., dedicated logic, circuitry, etc.), software (such as is run by, for example, a
general purpose computer or a dedicated machine), or a combination of both.

[0132] Referring to Figure 11, processing logic initially determines the tiles, tile-parts, or packets that cover the area,
resolution, components, and/or precincts to be edited and decodes them (processing block 1101). This determination
may be made in response to a user selecting an area and/or working resolution. The determination may use editing
information for a higher resolution to determine which parts or tiles cover the portion to be edited. Once decoding has
been completed, processing logic performs the desired edits (processing block 1102).

[0133] After performing the desired edits, processing logic recompresses the data into coded data (processing block
1103) and creates a replacement tile, tile-part, or packet for the codestream (processing block 1104). In one
embodiment, in creating the replacement tile, tile-part, or packet, processing logic pads out the data with bytes at the end of
the codestream if the new data is smaller than the unedited version of the data to make the replacement tile, tile-part
or packet the same size as the unedited version.

[0134] In an alternative embodiment, processing logic may use a marker, or tag, such as a COM marker segment
of the appropriate length instead of the padding. The COM marker could be used to fill space or could contain information
that the encoder wanted to include. It could contain information such as, for example, sideband information described
herein or a copyright license for an image or text or other file format information.

[0135] In one embodiment, in creating the replacement tile, tile-part, or packet, processing logic truncates the last
packets for any or all components until the data fits in the codestream if the new data is larger than the unedited version
of the data.
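
The size-matching step can be sketched as follows. This is an illustration under simplifying assumptions: the recompressed data is modelled as a list of packet byte strings, whole trailing packets are dropped rather than truncated within a packet, and the zero pad byte and the function name are hypothetical choices.

```python
def fit_replacement(packets, target_size, pad_byte=b"\x00"):
    """Make recompressed data occupy exactly target_size bytes.

    Trailing packets are dropped while the data is too large, then the
    remainder is padded up to target_size. Returns the fitted bytes and
    the number of packets that were dropped.
    """
    kept = list(packets)
    while kept and sum(len(p) for p in kept) > target_size:
        kept.pop()
    data = b"".join(kept)
    padded = data + pad_byte * (target_size - len(data))
    return padded, len(packets) - len(kept)
```

Because the replacement has the same length as the unedited data, it can overwrite the original bytes in place and the rest of the codestream does not move.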

[0136] Editing of an image may be performed by changing coded data for tiles, tile-parts, or codeblocks. In one
embodiment, editing is performed without changing file size by quantizing instead of expanding. In another embodiment,
a predetermined amount of extra space is allocated per tile or per codeblock to allow for a predetermined amount of
expansion. In still another embodiment, coded data may be put at the end of files by manipulating tile headers and putting
invalid tile data in COM markers.

[0137] Note that if there are subsequent tile-parts that depend on the data in the portion of the codestream that is
being edited, these tile-parts may become useless in the codestream. An indication of this useless data may be noted
to the decoder by one of several methods. These methods involve inserting or modifying information in the codestream
to indicate the presence and/or location of the useless data. In one embodiment, the application uses a status buffer
to indicate that the data in tile-parts subsequent to an edited tile-part may be useless. The status buffer may be in
workspace memory and describes dependencies between packets. If an earlier packet is altered, the subsequent
packets cannot be decoded as is. These subsequent packets must be edited accordingly or eliminated. In another
embodiment, such an indication may be made by zeroing out the data section of those tile-parts and/or creating a PPT
marker segment that denotes no data.

Optimal Encoder Quantization

[0138] During encoding, unquantized coefficients from some or all subbands may be divided by a value of Q to create
the quantized coefficient values. This value Q may have a wide range of values. In typical encoders, a number
of the values in a single particular range of values are made equal to one single coefficient value. In essence, all the
coefficients in the particular range are quantized to the same value. This can be exemplified by Figure 12, which shows
that the range of values often follows a bell shaped curve and that all of the values in a particular range
are sent to the decoder as one quantized value, and the decoder will reconstruct these values to a
particular value. Assume a decoder reconstructs these values to a predetermined value (e.g., floor(1/2 min + 1/2 max),
or min + 1/2 Q, where Q is the quantization step size). For example, if the range of values is between 16 and 31, then
the decoder may assume the value is 24. In one embodiment, instead of using 1/2 as the value, another value is
selected, such as floor(3/8 min + 5/8 max), or min + 3/8 Q, where Q is the quantization step size. Therefore, if the
range is from 16 to 31, then it is assumed that the decoder will reconstruct the value to 22, instead of 24.
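
The two reconstruction values in the example follow from the "min + fraction of Q" form. The sketch below assumes non-negative coefficients and a quantization bin that starts at a multiple of the step size; the function names and the `bias` parameter are hypothetical.

```python
import math

def quantize(value, step):
    """Index of the quantization bin containing a non-negative value."""
    return value // step

def dequantize(index, step, bias=0.5):
    """Reconstruct to the bin minimum plus a fraction (bias) of the step size."""
    return index * step + math.floor(bias * step)
```

With a step size of 16, every value from 16 to 31 falls in bin 1. A bias of 1/2 reconstructs that bin to 24, while a bias of 3/8 reconstructs it to 22, matching the numbers in the text.
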
[0139] In some cases, two spatially adjacent coefficients may be close to each other numerically yet in separate
quantization bins, such as coefficient value 1201 of range R2 and coefficient value 1202 of a different range in Figure 12.
The results of the quantization may cause an artifact to occur. In one embodiment, for coefficients near a boundary
between two quantization bins, the encoder selects the bin into which a coefficient, such as coefficient 1201, will be
quantized so that it is consistent with neighbors, such as coefficient 1202. This helps avoid artifacts. That is, this
technique reduces distortion yet may increase rate, particularly when a coefficient is moved from a smaller bin to a
higher bin.

Flicker Reduction for Motion JPEG 

[0140] At times, flicker occurs when applying wavelet compression to motion sequences. An example of such flicker
may include the image getting brighter or darker in areas or the appearance of edges changing in successive frames
as the motion sequence is played (mosquito noise around the edges). The flicker may be due to the application of
different local quantization to successive frames of a motion sequence or to noise exacerbated by quantization that is
viewed temporally.

[0141] To reduce flicker, coefficients that are in the same position and close to the same value in successive frames
are forced to the same value. That is, the coefficient values in successive frames are set to a predetermined value.
This is essentially a form of quantization that is applied during encoding. Figure 13 is a flow diagram of one embodiment
of a process to reduce flicker.

[0142] A test of whether to apply such quantization to a coefficient value in a subsequent frame is based on the
quantization that was performed on the coefficient in the previous frame. Thus, the encoder is utilizing frame
dependency to eliminate flicker while the decoder decodes data frame by frame independently.

[0143] In one embodiment, in order to reduce flicker in motion JPEG, coefficient values are modified (quantized)
based on their relationship with each other with respect to a threshold. For example, if D_n and D_{n+1} are the
corresponding coefficients (same spatial location and same subband) in two frames before quantization, if D'_n and D'_{n+1}
represent these coefficients after quantization, if Q(·) is scalar quantization, and if the value T is a threshold, then
the following may be applied:

    if (|Q(D_{n+1}) - D'_n| < T)
        D'_{n+1} = D'_n
    else
        D'_{n+1} = Q(D_{n+1})

For example, the value T may be twice the quantization step size. Other values of T include, but are not limited to,
$\sqrt{2}Q$, $\sqrt{3}Q$, and $2\sqrt{2}Q$.
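
The rule can be sketched for a single coefficient position tracked across frames. The Python below is an illustration only: it assumes a scalar quantizer Q that reconstructs to the nearest multiple of the step size, and the function names are hypothetical.

```python
import math

def quantize(value, step):
    """Scalar quantizer Q: reconstruct to the nearest multiple of step."""
    return step * math.floor(value / step + 0.5)

def reduce_flicker(values, step, threshold):
    """Apply the rule to one coefficient position over successive frames.

    values holds the unquantized coefficient D_n for each frame; the result
    holds the quantized coefficient D'_n for each frame.
    """
    output, previous = [], None
    for value in values:
        current = quantize(value, step)
        if previous is not None and abs(current - previous) < threshold:
            current = previous   # close enough: hold the previous frame's value
        output.append(current)
        previous = current
    return output
```

With a step size of 8 and T equal to twice the step size, the inputs 101, 107, 94, 141 give 104, 104, 104, 144: the small dip in the third frame is held at the previous value, while the large change in the fourth frame passes through.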

[0144] One of the coefficient values may be modified to be within a predetermined closeness to another coefficient
value. The closeness may be determined by some threshold. The threshold may be user set or adaptive based on
some criteria. The threshold could be different based on the subband and, perhaps, on the persistence of the particular
value (number of frames that this coefficient is close). In one embodiment, the coefficient value is set equal to the other
coefficient value. In alternative embodiments, the coefficient is set to be within the quantization bin size of the other
coefficient value or twice the quantization bin size.

[0145] Figure 14 illustrates one embodiment of an encoder (or portion thereof) that performs the quantization
described above. Referring to Figure 14, a quantizer 1400 receives coefficients 1410 for frames of a motion sequence
from a wavelet transform (not shown). The coefficients are received by quantization logic 1401, which uses a
threshold value stored in memory 1402 to compare coefficient values for the previous frame, stored in memory 1403,
with coefficients 1410 to which a scalar quantizer Q from memory 1404 has been applied.

[0146] Quantization logic 1401 may comprise comparison hardware (e.g., logic with gates, circuitry, etc.) or software
to perform the comparison. This comparison hardware and software may implement a subtractor or subtraction
operation. The result is a quantized codestream (assuming some values have been changed).

[0147] This may be applied over two or more frames. Also, the comparison is not limited to two consecutive frames.
The comparison can be over 3, 4, 5, etc., frames, for example, to determine if a variance exists. Figure 24 illustrates
one example in which values in a first and third frame are used to set the value in the second frame.

[0148] Note that the quantization can also be codestream quantization with a code block-based rule.

Rate Control, Quantization, and Layering

[0149] In one embodiment, selective quantization of coefficients can be performed during encoding by setting a
subset of the refinement bits to be the more probable symbol (MPS). This may be performed at a user selected bitplane.
For example, if there is text on a background image, with a goal of having sharp text images while minimizing coded
data required for the background, the refinement bits that are set to MPS are those that do not affect text for the last
bitplane, while the actual value is used for bits that affect text.

[0150] Such a quantization scheme may be used to implement non-uniform quantization step sizes. For example, if
one wanted to have a background with fewer bits, setting the refinement bits to the MPS could operate as a form of
quantization. This quantization scheme causes some level of distortion but lowers the bit rate necessary to transfer
the codestream.

[0151] Note that although this technique may be applied to bits generated during the refinement pass, the technique
has application to other compression schemes (e.g., lists generated during subordinate passes, tail bits of CREW of
Ricoh Silicon Valley, Menlo Park, California, MPEG IV texture mode, etc.).

[0152] In one embodiment, the same technique may be applied to other changes between frames. That is, in one
embodiment, a change due to a rate distortion in one frame may be performed in a subsequent frame to avoid distortion
effects.

Rate Control and Quantization 

[0153] In one embodiment, user specified quantization is provided. For a 3 level transform for one component, 7
quantization values are sufficient: level 1 HH, level 1 HL and LH, level 2 HH, level 2 HL and LH, level 3 HH, level 3 HL
and LH, and level 3 LL.

[0154] If quantization values are bitplanes to truncate (which is equivalent to scalar quantization by powers of 2),
3-bit values (0...7) are sufficient for most applications. (For image components with depth 12-bits or more and 5 or
more transform levels, perhaps higher quantizations might be useful.) Values 0...6 could be used to specify the number
of bitplanes to truncate and 7 could be used to mean discard all bitplanes. The three bit values may be written to a
controller that controls compression (or decompression) hardware (e.g., JPEG 2000 compatible hardware) to perform
the quantization.

[0155] For 3 component color quantization:

• 21 values can be used with separate values for each component,

• 14 values can be used, 7 for luminance and 7 for chrominance,

• 17 values can be used for 4:1:1 subsampled data, 7 for luminance and 5 for each chrominance component,

• 12 values can be used for 4:1:1 subsampled data, 7 for luminance and 5 for chrominance,

• 19 values can be used for 4:2:2 subsampled data, 7 for luminance and 6 for each chrominance component, and

• 13 values can be used for 4:2:2 subsampled data, 7 for luminance and 6 for chrominance.

Since 21 × 3 = 63 bits is less than 8 bytes, transferring or storing the quantization uses few resources. A central
processing unit (CPU) might select one predetermined quantizer from a table and write it to a CPU or other controller
controlling special purpose JPEG 2000 hardware (a chip) for each frame of a motion JPEG 2000 video sequence.
Alternatively, one implementation of JPEG 2000 might have a small memory that holds 8 or 16 different quantizers that
could be selected for each frame.
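As a rough illustration of why 21 three-bit values fit within 8 bytes, the following C sketch packs them into one 64-bit word and unpacks them again. The bit layout (value i in bits 3i to 3i+2) and the function names are assumptions made for this example; the specification does not define a register format.

```c
#include <stdint.h>

#define NUM_QVALUES 21   /* 3 components x 7 subband groups */

/* Pack 21 three-bit quantization values into the low 63 bits of a word. */
uint64_t pack_quantizer(const uint8_t values[NUM_QVALUES])
{
    uint64_t word = 0;
    for (int i = 0; i < NUM_QVALUES; i++)
        word |= (uint64_t)(values[i] & 0x7u) << (3 * i);
    return word;
}

/* Recover the 21 three-bit values from a packed word. */
void unpack_quantizer(uint64_t word, uint8_t values[NUM_QVALUES])
{
    for (int i = 0; i < NUM_QVALUES; i++)
        values[i] = (uint8_t)((word >> (3 * i)) & 0x7u);
}
```

Bit 63 of the packed word is always zero, which matches the 63-bit count given above.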

[0156] Quantizers can also be used to assign bitplanes to layers. For example, Q0, Q1 and Q2 may be quantizers
that specify bitplanes of coding pass to quantize. Quantizer Q0 causes the most loss, while quantizer Q2 causes the
least loss. Layer 1 is all the data quantized by Q0 but not quantized by Q1. Layer 2 is all the data quantized by Q1 but
not quantized by Q2. Layer 3 is all the data quantized by Q2.

Simple Quantization

[0157] Figures 17 and 18 show example quantizers (labeled A...Q) for the 3-level 5/3 transform as the number of
coefficient LSBs to truncate or not code. Truncating N bitplanes is equivalent to a scalar quantizer of 2^N. The subband
where the quantization changes with respect to the previous quantizer is highlighted with a dashed box. The quantizers
D, K and Q all have the same relationship between the subbands. Other quantizers might be used that are better for
MSE or for other distortion metrics.

[0158] The exemplary Verilog below converts a single quantization value "q" into seven quantizers (number of LSBs
to truncate). The variable q_1HH is used for level 1 HH coefficients, the variable q_1H is used for level 1 HL and
LH coefficients, etc. Some consecutive values of q result in the same quantizer: 0 and 1; 2 and 3; 4 and 5; 8i+6 and
8i+7 for all integers i with i ≥ 0.

module makeQ(q, q_1HH, q_1H, q_2HH, q_2H, q_3HH, q_3H, q_3LL);

  input  [5:0] q;
  output [3:0] q_1HH;
  output [3:0] q_1H;
  output [3:0] q_2HH;
  output [2:0] q_2H;
  output [2:0] q_3HH;
  output [2:0] q_3H;
  output [2:0] q_3LL;

  // Temporaries are declared signed so that the "< 0" clamps below can be true.
  wire signed [3:0] temp_2H;
  wire signed [3:0] temp_3HH;
  wire signed [3:0] temp_3H;
  wire signed [3:0] temp_3LL;
  wire [2:0] qlo;
  wire [2:0] qhi;

  // Split q into a coarse part (qhi) and a fine part (qlo).
  assign qlo = q[2:0];
  assign qhi = q[5:3];

  // Subbands that gain one extra truncated bitplane as qlo grows.
  assign q_1HH = qhi + ((qlo >= 2) ? 1 : 0);
  assign q_1H  = qhi + ((qlo >= 4) ? 1 : 0);
  assign q_2HH = qhi + ((qlo >= 6) ? 1 : 0);

  // Subbands that lag one bitplane behind until qlo reaches their threshold.
  assign temp_2H  = qhi + ((qlo >= 1) ? 0 : -1);
  assign temp_3HH = qhi + ((qlo >= 3) ? 0 : -1);
  assign temp_3H  = qhi + ((qlo >= 5) ? 0 : -1);
  assign temp_3LL = qhi - 1;

  // Clamp negative results to zero.
  assign q_2H  = (temp_2H  < 0) ? 0 : temp_2H;
  assign q_3HH = (temp_3HH < 0) ? 0 : temp_3HH;
  assign q_3H  = (temp_3H  < 0) ? 0 : temp_3H;
  assign q_3LL = (temp_3LL < 0) ? 0 : temp_3LL;

endmodule
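For readers who prefer software, the same mapping can be modelled in C. This is a sketch that mirrors the assignments of the Verilog module; the struct `Quantizers` and the helper names are introduced here for illustration only.

```c
/* Seven per-subband quantizers (number of LSBs to truncate). */
typedef struct {
    int q_1HH, q_1H, q_2HH, q_2H, q_3HH, q_3H, q_3LL;
} Quantizers;

/* Clamp negative values to zero, like the "< 0" tests in the Verilog. */
static int clamp0(int v) { return v < 0 ? 0 : v; }

/* Software model of makeQ; only the low 6 bits of q are used. */
Quantizers make_q(unsigned q)
{
    int qlo = (int)(q & 7u);          /* q[2:0] */
    int qhi = (int)((q >> 3) & 7u);   /* q[5:3] */
    Quantizers out;
    out.q_1HH = qhi + (qlo >= 2);
    out.q_1H  = qhi + (qlo >= 4);
    out.q_2HH = qhi + (qlo >= 6);
    out.q_2H  = clamp0(qhi - (qlo < 1));
    out.q_3HH = clamp0(qhi - (qlo < 3));
    out.q_3H  = clamp0(qhi - (qlo < 5));
    out.q_3LL = clamp0(qhi - 1);
    return out;
}

/* Returns 1 when two quantizer sets are identical, 0 otherwise. */
int same_quantizers(Quantizers a, Quantizers b)
{
    return a.q_1HH == b.q_1HH && a.q_1H == b.q_1H && a.q_2HH == b.q_2HH &&
           a.q_2H == b.q_2H && a.q_3HH == b.q_3HH && a.q_3H == b.q_3H &&
           a.q_3LL == b.q_3LL;
}
```

Comparing `make_q(0)` with `make_q(1)`, or `make_q(14)` with `make_q(15)`, shows the duplicated quantizers noted in the text, while `make_q(8)` and `make_q(9)` differ in `q_2H`.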

Human Visual System Weighting for Color and Frequency 

[0159] Table 9 shows additional bitplanes to quantize (e.g., truncate) for luminance to take advantage of the frequency
response of the Human Visual System (from Table J-2 of the JPEG 2000 standard). A viewing distance of 1000 pixels
might be appropriate for viewing images on a computer monitor. Larger viewing distances might be appropriate for
print images or television.

Table 9 - Human Visual System Weighting for Luminance

              extra bitplanes to quantize for viewing distance of...
subband       1000 pixels    2000 pixels    4000 pixels
1HH           2              4 or 5         discard all
1HL, 1LH      1              2 or 3         6
2HH           -              2              4 or 5
2HL, 2LH      -              1              2 or 3
3HH           -              -              2
3HL, 3LH      -              -              1

Additionally chrominance may be quantized more heavily than luminance. 

[0160] Figure 19 shows a quantization that starts with Figure 17(D) and then adds frequency weighting for a 1000
pixel viewing distance (to both luminance and chrominance), keeps 3LL chrominance unchanged, discards 1HL and
1HH chrominance for 4:2:2, and discards an additional 2 bitplanes for the remaining chrominance.

[0161] Sharp text without ringing artifacts is more desirable than exact gray value for text/background. That is, if a
gray level is supposed to be at 50% (for example), and is instead at 60%, it is often not visually objectionable if the
image is of text. In one embodiment, the LL (DC) coefficients are quantized more heavily for text than for non-text
images at low bitrate. For example, for an 8-bit image component, a quantization step size of 8, 16 or 32 might be used
for text-only regions and a quantization step size of 1, 2 or 4 might be used for regions containing non-text. This allows
more fidelity for the high frequency coefficients, thereby resulting in text with sharp edges.


Using Quantizers to Divide Things into Layers 

[0162] Table 10 shows 16 example quantizers. Quantizer 15 is lossless. Quantizer 8 is the same as Figure 19. These
can be used to divide the subband bitplanes into layers.

Table 10

subband       0    1    2    3    4    5    6    7    8    9    10   11   12   13   14   15
Y 1HH         all  all  6    6    5    5    4    4    3    3    2    2    1    1    0    0
Y 1HL, 1LH    6    5    5    4    4    3    3    2    2    1    1    0    0    0    0    0
Y 2HH         5    4    4    3    3    2    2    1    1    0    0    0    0    0    0    0
Y 2HL, 2LH    4    4    3    3    2    2    1    1    0    0    0    0    0    0    0    0
Y 3HH         4    4    3    3    2    2    1    1    0    0    0    0    0    0    0    0
Y 3HL, 3LH    4    3    3    2    2    1    1    0    0    0    0    0    0    0    0    0
Y 3LL         0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
C1 1HL, 1HH   HL and HH always discarded for 4:1:1 or 4:2:2 only
C1 1LH        all  all  all  all  6    6    5    5    4    4    3    3    2    2    1    0
C1 2HH        all  6    6    6    5    4    4    3    3    2    2    1    1    0    0    0
C1 2HL, 2LH   all  6    5    5    4    4    3    3    2    2    1    1    0    0    0    0
C1 3HH        all  6    5    5    4    4    3    3    2    2    1    1    0    0    0    0
C1 3HL, 3LH   all  5    5    4    4    3    3    2    2    1    1    0    0    0    0    0
C1 3LL        all  0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
C2 1HL, 1HH   HL and HH always discarded for 4:1:1 or 4:2:2 only
C2 1LH        all  all  all  all  6    6    5    5    4    4    3    3    2    2    1    0
C2 2HH        all  6    6    5    5    4    4    3    3    2    2    1    1    0    0    0
C2 2HL, 2LH   all  6    5    5    4    4    3    3    2    2    1    1    0    0    0    0
C2 3HH        all  6    5    5    4    4    3    3    2    2    1    1    0    0    0    0
C2 3HL, 3LH   all  5    5    4    4    3    3    2    2    1    1    0    0    0    0    0
C2 3LL        all  0    0    0    0    0    0    0    0    0    0    0    0    0    0    0

[0163] Layer 0 contains all data not quantized away by quantizer 0. This would be luminance data only: all of 3LL;
all but 4 bitplanes of 2HL, 2LH, 3HL, 3LH and 3HH; all but 5 bitplanes of 2HH; and all but 6 bitplanes of 1HL and 1LH.
Layer 1 contains all data not in layer 0 and not quantized away by quantizer 1. This would be luminance bitplane 5
for 1HL and 1LH, bitplane 4 for 2HH, and bitplane 3 for 3HL and 3LH; all 3LL chrominance; all but 5 bitplanes for
chrominance 3HL and 3LH; and all but 6 bitplanes for chrominance 2HL, 2LH and 3HH. Finally, layer 15 would contain
the LSB of 1LH chrominance.
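The layer membership described above can be derived mechanically from one row of Table 10. The following C sketch assumes bitplanes are numbered from 0 at the LSB and that "all" is encoded by a large sentinel; both conventions, and the function name, are choices made for this example rather than part of the specification.

```c
#define NUM_QUANTIZERS 16
#define DISCARD_ALL    255   /* assumed encoding of the "all" table entries */

/* truncate[k] is the number of LSBs removed by quantizer k for one subband
   (one row of Table 10). Returns the first layer that carries `bitplane`
   (0 = LSB), or -1 if no quantizer keeps that bitplane. */
int layer_of_bitplane(const unsigned char truncate[NUM_QUANTIZERS], int bitplane)
{
    for (int k = 0; k < NUM_QUANTIZERS; k++) {
        /* Quantizer k keeps every bitplane at or above truncate[k]. */
        if (bitplane >= truncate[k])
            return k;
    }
    return -1;
}
```

Applied to the luminance 1HL, 1LH row, bitplane 5 lands in layer 1; applied to the chrominance 1LH row, the LSB lands in layer 15, in agreement with the description above.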

Rate Control with Multiple Layers and Tile-Parts 

[0164] There are several well known techniques for rate control in compression systems. The simplest method is to pick
a distortion for every "unit" compressed (a unit may be an 8x8 block in JPEG, a frame in a motion sequence, a tile of
a single image, a subband of a tile in a wavelet coded image, etc.). If the distortion selected leads to a bitrate higher
than the desired average bitrate, the distortion allowed is increased for new units as they are compressed. If the
distortion selected leads to a bitrate lower than the desired average bitrate, the distortion allowed is decreased for new
units as they are compressed.
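The simple feedback scheme can be sketched in C as below. The multiplicative adjustment factor `step` and the struct layout are assumptions made for illustration; the text does not say how much the allowed distortion changes per unit.

```c
/* State for the simplest form of rate control described above. */
typedef struct {
    double target_bits_per_unit;  /* desired average bitrate            */
    double distortion;            /* distortion allowed for next unit   */
    double step;                  /* assumed adjustment factor, > 1     */
    double total_bits;            /* bits produced so far               */
    long   units;                 /* units compressed so far            */
} RateControl;

/* Call after each unit is compressed with the number of bits it used. */
void rate_control_update(RateControl *rc, double bits_used)
{
    rc->total_bits += bits_used;
    rc->units += 1;
    double average = rc->total_bits / (double)rc->units;
    if (average > rc->target_bits_per_unit)
        rc->distortion *= rc->step;   /* over budget: allow more distortion */
    else if (average < rc->target_bits_per_unit)
        rc->distortion /= rc->step;   /* under budget: allow less distortion */
}
```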

[0165] A more complex method buffers the compressed data from some number of "units." The bitrate and/or
distortion for each unit at each distortion level is stored. Then the distortion to allow across all the units in the buffer is
determined when the buffer is full. If the buffer is sufficient to contain the entire image, extremely high quality results
can be obtained. In JPEG 2000, layers are designed to contain increments to quality. Thus, selecting a distortion can
mean selecting the number of layers to use for each code block or tile. A complete description of this type of rate control
is in David Taubman, "High Performance Scalable Image Compression with EBCOT," IEEE Transactions on Image
Processing.

[0166] There are several disadvantages to this process. One disadvantage is that a buffer memory for the entire
codestream is required. A second disadvantage is that the latency (time until any of the codestream is output) is high.
A third disadvantage is that the second pass could take a large amount of time.

[0167] To mitigate these problems, each tile of a JPEG 2000 codestream is encoded as described above with at
least two layers. At the completion of encoding each tile, a number of packets (e.g., layer, resolution, precinct,
tile-component) are output to the codestream as a complete tile-part. The remaining layers are stored in the buffer. A
second pass through the remaining coded data in the buffer is optional. During this second pass, extra packets from
each tile are appended to the codestream as complete tile-parts as space or time allows. In a fixed-rate application,
only packets within the given rate are appended. In a fixed-time application, packets are appended only for the number
of cycles allowed. One embodiment of this process is shown in Figure 15A. Thus, these can be the 2 complete
tile-parts output for each tile.

[0168] Figure 15B illustrates a number of layers, layers 1-n. Layer 1 is output on the first pass, and the remaining
layers are most likely below fixed-time or fixed-rate time limits. Layer 2 may be output on a second pass within
fixed-time or fixed-rate requirements while achieving similar distortion over all the components.

[0169] The above process is advantageous in that it allows the buffer to store a fraction of the coded data, the first
data can be output (transmitted or stored) sooner, and the second pass through the data can be faster because there
is less data to process. Also, less memory is required for buffering.

[0170] The criterion for selecting which packets go into the first set of tile-parts can be similar to any other rate control
algorithm. In one embodiment, the rate of packets can be less than the desired average bitrate for the whole image.
For example, if a final compressed bitstream at 2.0 bpp is desired, the first pass could place 1.5 bpp for every tile in
the codestream, and buffer 1 bpp for every tile.

[0171] The second pass can select from the remaining data the packets to place in the second tile-part of each tile.
Thus, to obtain a 2.0 bpp average encoding, some tiles that had high distortion after the first pass could receive all the
remaining data saved for the tile, while other tile parts which had low distortion after the first pass might not have any
additional data transmitted.
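One possible second-pass policy is sketched below in C: buffered tile-parts are granted in order of decreasing first-pass distortion until the remaining byte budget is exhausted. The greedy ordering, the all-or-nothing treatment of each buffered tile-part, and every name in the sketch are assumptions for illustration; the text leaves the selection rule open.

```c
#include <stddef.h>

typedef struct {
    double distortion;      /* distortion after the first tile-part        */
    size_t buffered_bytes;  /* size of the buffered second tile-part       */
    int    send_second;     /* output flag: 1 if the second part is sent   */
} TileState;

/* Marks which buffered tile-parts to append; returns the bytes used. */
size_t second_pass_select(TileState *tiles, size_t n, size_t budget)
{
    size_t used = 0;
    for (size_t i = 0; i < n; i++)
        tiles[i].send_second = 0;

    for (;;) {
        size_t best = n;   /* n means "no candidate found" */
        for (size_t i = 0; i < n; i++) {
            if (tiles[i].send_second)
                continue;                          /* already granted   */
            if (tiles[i].buffered_bytes > budget - used)
                continue;                          /* does not fit      */
            if (best == n || tiles[i].distortion > tiles[best].distortion)
                best = i;                          /* worst tile so far */
        }
        if (best == n)
            break;
        tiles[best].send_second = 1;
        used += tiles[best].buffered_bytes;
    }
    return used;
}
```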

Rate Control for Compressed Codestream Data 

[0172] Some rate control techniques described herein include rate control performed on a compressed codestream
based on a request implemented by selecting some number of layers to keep in the codestream. A parser may be
used to produce a new codestream which shows the bitrate based on layers. This bitrate is equal to or less than the
bitrate specified by the request.

[0173] The parser may use a data structure referred to herein as a "packet structure." Note that this data structure
may be used for other purposes such as, for example, the versatile packet data structure described below. In one
embodiment, the packet structure includes a packet start pointer and packet length. It also contains a tile number, a
resolution, a component, a layer, and a precinct the packet belongs to. Finally, it also includes a selection flag. This
flag, when set to a predetermined value (e.g., 1), indicates that the packet is selected in the array for writing out to a
new codestream.

[0174] In one embodiment, packets are read in sequential order from a codestream based on the progression order
information indicated by the COD marker.

[0175] The number of bytes is computed based on the bitrate desired by the request. The number of bytes belonging
to layer 0 is added up to a total. Then this total of bytes is compared with the number of bytes desired. If the total is
less than the number of bytes desired, one additional layer is added to the total. The process continues until the total
is equal to or greater than the number of bytes desired or all packets have been added.

[0176] During the process, those packets which have been added to the total are marked as selected by the selection
flag in the structure.

[0177] If the total is equal to the number of bytes desired, the addition process is stopped. If the total exceeds the
number of bytes desired, the packets in the last layer added are subtracted from the total. This is done to guarantee
that the bitrate is below the bitrate desired. Consequently, during the subtraction step, packets which have been
subtracted from the total are marked unselected.
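The add-then-subtract procedure of the preceding paragraphs can be sketched in C as follows. The struct `LayerPacket` is a hypothetical cut-down version of the packet structure (length, layer and selection flag only), and the function name is likewise an assumption.

```c
/* Hypothetical subset of the packet structure used for layer selection. */
typedef struct {
    int            length;  /* packet length in bytes        */
    unsigned short l;       /* layer the packet belongs to   */
    unsigned char  select;  /* selection flag                */
} LayerPacket;

/* Selects whole layers, starting from layer 0, without exceeding
   bytes_desired. Returns the total number of bytes selected. */
long select_layers(LayerPacket *pk, int num_packets, int num_layers,
                   long bytes_desired)
{
    long total = 0;
    for (int i = 0; i < num_packets; i++)
        pk[i].select = 0;

    for (int layer = 0; layer < num_layers; layer++) {
        long layer_bytes = 0;
        for (int i = 0; i < num_packets; i++) {
            if (pk[i].l == layer) {
                pk[i].select = 1;            /* added to the total */
                layer_bytes += pk[i].length;
            }
        }
        total += layer_bytes;
        if (total == bytes_desired)
            break;                           /* exact fit: stop adding */
        if (total > bytes_desired) {
            /* Overshoot: subtract the last layer and unselect its packets. */
            for (int i = 0; i < num_packets; i++)
                if (pk[i].l == layer)
                    pk[i].select = 0;
            total -= layer_bytes;
            break;
        }
    }
    return total;
}
```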

[0178] In one embodiment, the related markers such as SOT, COD, PLT are updated according to the request.
Packets are written to the new codestream. The packet structure may be created using the following:

typedef struct _PACK_ {        /* packet structure */
    int start;                 /* packet starting point */
    int length;                /* packet length */
    unsigned short t;          /* tile number the packet belongs to */
    unsigned short r;          /* resolution the packet belongs to */
    unsigned short c;          /* component the packet belongs to */
    unsigned short l;          /* layer the packet belongs to */
    unsigned short p;          /* precinct the packet belongs to */
    unsigned char select;      /* selection flag */
} Packet;

/* Store packets from the tp->tile[i].size[] array to the packet structure array. */
/* Layer progression (LRCP) order */
if (progression_order == 0) {
    j = 0;                                     /* index into the packet array */
    for (i = 0; i < number_of_tile; i++) {
        m = 0;                                 /* packet index within tile i */
        for (l = 0; l < layer; l++) {
            for (r = 0; r < resolution + 1; r++) {
                for (c = 0; c < component; c++) {
                    for (p = 0; p < precinct[r]; p++) {
                        tp->pk[j].start = tp->tile[i].pointer[m];
                        tp->pk[j].length = tp->tile[i].size[m];
                        total_length += tp->tile[i].size[m];
                        tp->pk[j].t = i;
                        tp->pk[j].r = r;
                        tp->pk[j].l = l;
                        tp->pk[j].c = c;
                        tp->pk[j].p = p;
                        m++;
                        j++;                   /* advance to the next packet slot */
                    }
                }
            }
        }
        num_packet[i] = m;                     /* number of packets in tile i */
    }
}

Versatile Packet Data Structure

[0179] The same packet data structure described above can be used to facilitate other parsing options, once packets
are read into the structure.

[0180] For resolution parsing, the packets which are to be excluded are marked unselected. For example, given a
4-resolution codestream and a request to produce a 3-resolution codestream, a parser marks all packets which
belong to resolution 4 unselected. Then the newly produced codestream contains only packets from resolution 1 up
to resolution 3.

[0181] Similarly, component parsing, progression conversion parsing, and quality parsing can be performed by
processing the packets in the structure step by step.

[0182] The packet data structure can handle complex requests. An example is a request which requires the parser
to produce a codestream which has 3 resolutions, 2 layers, and 1 component.
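A combined request of this kind reduces to one pass over the packet array. The C sketch below uses a hypothetical struct `ParsePacket` holding only the fields needed here, and it assumes zero-based resolution, layer and component indices (as in the loop counters of the packet-structure code), whereas the prose above counts resolutions from 1.

```c
/* Hypothetical subset of the packet structure used for parsing. */
typedef struct {
    unsigned short r;       /* resolution index, 0-based (assumed) */
    unsigned short c;       /* component index, 0-based (assumed)  */
    unsigned short l;       /* layer index, 0-based (assumed)      */
    unsigned char  select;  /* selection flag                      */
} ParsePacket;

/* Keeps only packets within the requested number of resolutions, layers
   and components; returns how many packets remain selected. */
int parse_select(ParsePacket *pk, int n, int resolutions, int layers,
                 int components)
{
    int kept = 0;
    for (int i = 0; i < n; i++) {
        pk[i].select = (unsigned char)(pk[i].r < resolutions &&
                                       pk[i].l < layers &&
                                       pk[i].c < components);
        kept += pk[i].select;
    }
    return kept;
}
```

Calling `parse_select(pk, n, 3, 2, 1)` corresponds to the 3-resolution, 2-layer, 1-component request mentioned above.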

Clipping After Each Inverse Transform 


[0183] As a result of quantization performed on wavelet coefficients, the final decoded pixels are often outside of the
original range of allowed pixels from the specified bit depth. Typically, these pixels are clipped to the original range so
that further image processing or display devices can use the original bit depth.

[0184] For example, an eight bit image has pixel values between 0 and 255, inclusive. After lossy compression is
used, the decoded image may contain values like -5 and 256. To provide an eight bit output, these values are clipped
to 0 and 255 respectively. This clipping procedure always reduces pixel wise distortion because the original image did
not contain pixels outside of the clipping bounds. This procedure is well known and recommended by the JPEG 2000
standard.

[0185] In addition to the bounds on the final output samples, there are bounds on the values coefficients can assume
at the various stages of the wavelet transform. Just as quantization can change the final decoded samples to lie outside
the original bounds, quantization can change the partially transformed wavelet coefficients to lie outside their original
bounds. If these coefficients are clipped to their original bounds, distortion will decrease.

[0186] For example, after a horizontal (one dimensional) 5-3 reversible transform as specified by JPEG 2000 with 8
bit input samples, the maximum value of the low pass coefficient is +191, and the minimum possible value is -191.
The high pass coefficient must be between -255 and 255 inclusive. After the vertical one dimensional transform, the
Low-Low coefficients are bounded by -286 and 287. Thus when decoding an eight bit image, when the first level
low-low pass coefficients are generated (by the inverse wavelet transform from a higher level), the coefficients can be
clipped to -286 and +287, and distortion will decrease. Likewise, after the first level vertical inverse transformation is
done, the low pass coefficients can be clipped to -191, +191, and the high pass coefficients can be clipped to -255, 255.

[0187] For each subband, each filter, each transform level, and each image depth, there is a different maximum and
minimum value for the coefficients. These maximum and minimum values can be computed by finding the signal that
leads to the maximum and minimum, running the forward compression system, and recording the maxima. The
signals that lead to extreme values come from inputs where each pixel is either a maximum or minimum. Which pixels
should be maximum and which pixels should be minimum can be determined by convolving sequences which are -1
when the wavelet coefficient is negative and +1 when the wavelet coefficient is positive. For the 5-3 filter used in JPEG
2000 Part I, the low pass signal of interest is [-1 +1 +1 +1 -1] and the high pass signal is [-1 +1 -1].
[0188] The signal (image) which will generate the largest LL value is:

    +1 -1 -1 -1 +1
    -1 +1 +1 +1 -1
    -1 +1 +1 +1 -1
    -1 +1 +1 +1 -1
    +1 -1 -1 -1 +1

(where +1 must be replaced by the input maximum (e.g., 255) and -1 must be replaced by the input minimum (e.g., 0)).
[0189] For irreversible filters, it is not necessary to actually run the system to determine the maxima; simply convolving
the wavelet coefficients is sufficient. For the reversible 5-3 filter, however, the floor function is used in the computation
of coefficients and is also used to determine the correct maxima.
[0190] Note that this may be used for other filters (e.g., a 9-7 filter).
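The bounds quoted above for the reversible 5-3 filter can be reproduced by running the forward lifting steps on every five-sample signal whose samples are each the input minimum or maximum. The C sketch below does this for one dimension. It assumes the standard 5-3 lifting equations with floor division and assumes 8-bit samples are level shifted to the range -128..127; the function and struct names are introduced for this example.

```c
#include <limits.h>

/* floor(a / b) for b > 0; C's "/" truncates toward zero instead. */
static int floor_div(int a, int b)
{
    int q = a / b;
    return (a % b != 0 && a < 0) ? q - 1 : q;
}

typedef struct { int low_min, low_max, high_min, high_max; } Bounds53;

static void track(int v, int *lo, int *hi)
{
    if (v < *lo) *lo = v;
    if (v > *hi) *hi = v;
}

/* Extreme coefficient values of one 1-D reversible 5-3 stage when every
   input sample lies in [in_min, in_max]. */
Bounds53 bounds_53_1d(int in_min, int in_max)
{
    Bounds53 b = { INT_MAX, INT_MIN, INT_MAX, INT_MIN };
    for (int mask = 0; mask < 32; mask++) {
        int x[5];
        for (int k = 0; k < 5; k++)
            x[k] = ((mask >> k) & 1) ? in_max : in_min;
        /* High pass (predict) at the two odd positions. */
        int d1 = x[1] - floor_div(x[0] + x[2], 2);
        int d3 = x[3] - floor_div(x[2] + x[4], 2);
        /* Low pass (update) at the centre even position. */
        int s = x[2] + floor_div(d1 + d3 + 2, 4);
        track(d1, &b.high_min, &b.high_max);
        track(d3, &b.high_min, &b.high_max);
        track(s, &b.low_min, &b.low_max);
    }
    return b;
}

/* Clip a coefficient to its known bounds. */
int clip_coeff(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}
```

Under these assumptions, `bounds_53_1d(-128, 127)` gives a low pass range of -191..191 and a high pass range of -255..255, and feeding the low pass range back in with `bounds_53_1d(-191, 191)` gives -286..287, the figures quoted above.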

[0191] Figure 28 is a flow diagram of one embodiment of a process for applying an inverse transform with clipping
on partially transformed coefficients. The process is performed by processing logic, which may comprise hardware
(e.g., circuitry, dedicated logic, etc.), software (such as that which runs on a general purpose computer system or a
dedicated machine), or a combination of both.

[0192] Referring to Figure 28, processing logic applies a first level inverse transform to coefficients (processing block
2801). Thereafter, processing logic clips the partially transformed coefficients to a predetermined range (processing
block 2802). Next, processing logic applies a first level inverse transform to the clipped coefficients (processing block
2803) and clips the partially transformed coefficients to a predetermined range (processing block 2804), which is
different than the range in processing block 2802. Again, processing logic applies a first level inverse transform to clipped
coefficients (processing block 2805) and clips the partially transformed coefficients to still another predetermined range
(processing block 2806).

Simplified Colorspace Handling

[0193] A typical decoding process including color management is shown in Figure 25. Referring to Figure 25, a file
with a file format (e.g., a file format described in the JPEG 2000 standard) containing a restricted ICC profile is provided
to a decoding device. Decompression block 2501 decompresses the file by taking the codestream portion of the file
and performing context modeling, entropy decoding, and applying an inverse wavelet transform, but does not perform
color space operations. If the codestream indicates the RCT or ICT component transform should be used to decode
the codestream, these will be performed by block 2502. That is, inverse RCT/ICT block 2502 takes the components
and the "RCT Y/N" indication (RCT if yes, ICT if no) and performs the specified inverse transform and provides
(non-display) RGB pixels. (If specified by the syntax, inverse level shifting is also performed.)

[0194] Finally, the ICC color profile from the file format along with information about the display device will be used
to produce the output pixels.

[0195] Inverse ICC block 2503 receives the (non-display) RGB pixels and the ICC profile and applies an inverse
color space transform to provide display RGB pixels.

[0196] Figure 26 illustrates one embodiment of a non-preferred camera encoder. Referring to Figure 26, a camera
generates YCrCb pixels. A converter 2602 converts the YCrCb pixels to RGB pixels and provides those to a typical
JPEG 2000 encoder. The encoder comprises an RCT/ICT converter 2603 followed by a compressor 2604. The
compressor generates an ICC_A for the codestream.

[0197] Figure 27 illustrates one embodiment of a simpler camera encoder. That is, instead of including RCT/ICT
converter 2603 and compressor 2604, a simple camera encoder includes only compressor block 2702. Referring to
Figure 27, a camera 2701 generates YCrCb pixels and provides them to compressor 2702. The compressor comprises a
JPEG 2000 encoder without an RCT conversion and generates an ICC_B codestream with RCT equaling 1 (with syntax
signaling that the inverse RCT should be used on decoding). The relationship between ICC_B and ICC_A is given by the
following equation:

    ICC_B = ICC_A ∘ YCrCb^{-1} ∘ RCT

where ∘ represents function composition.

[0198] Restricted ICC profiles are "syntaxes" for functions on pixels. A camera will typically write the same profile
for all images, so ICC_B can be computed offline and copied into each output file. In a prior art system there must be
hardware for YCrCb^{-1} and RCT/ICT which operates on every pixel.

Coding 4:2:2 and 4:1:1 Data as 4:4:4 Data with Quantization 

[0199] The JPEG 2000 standard is typically used to handle data in a 4:4:4 format. It is not capable of describing
how to reconstruct data in 4:1:1 or 4:2:2 formats in a 4:4:4 format for output. In one embodiment, when encoding
4:1:1 data, the encoder treats 1HL, 1LH and 1HH coefficients as zero. When encoding 4:2:2 data, the encoder treats
1HL and 1HH coefficients as zero. Thus, with all information in the extra subbands quantized to zero, a decoder is able
to receive the codestream in a way it expects. In other words, the encoded data resembles 4:4:4 data that has been
heavily quantized.
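A minimal C sketch of this subband zeroing is given below. The struct `Level1Subbands`, which simply points at the three level 1 detail subbands of one chrominance component, and the enum names are hypothetical; how an encoder actually stores its coefficients is not specified here.

```c
#include <stddef.h>
#include <string.h>

typedef enum { CHROMA_444, CHROMA_422, CHROMA_411 } ChromaFormat;

/* Hypothetical view of the level 1 detail subbands of one component. */
typedef struct {
    int   *hl, *lh, *hh;  /* coefficient arrays                    */
    size_t count;         /* number of coefficients in each array  */
} Level1Subbands;

/* Treats the "extra" level 1 subbands as zero so that subsampled data
   can be coded as if it were heavily quantized 4:4:4 data. */
void zero_extra_subbands(Level1Subbands *sb, ChromaFormat format)
{
    size_t bytes = sb->count * sizeof(int);
    if (format == CHROMA_444)
        return;                     /* nothing to discard */
    memset(sb->hl, 0, bytes);       /* 1HL is zero for 4:2:2 and 4:1:1 */
    memset(sb->hh, 0, bytes);       /* 1HH is zero for 4:2:2 and 4:1:1 */
    if (format == CHROMA_411)
        memset(sb->lh, 0, bytes);   /* 1LH is also zero for 4:1:1 */
}
```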

File Order for Thumbnail, Monitor, Printer, and Full Resolution and Quality

[0200] Multiple images at multiple resolutions are important in many image processing situations. Depending on the
application, a user may want to select different images of different resolutions. For example, thumbnail images may
be used as an index into a large number of images. Also, a screen resolution image may be the image used to send
to a monitor for display thereon. A print resolution image may be of lower quality for printer applications.

[0201] In one embodiment, a codestream of an image is organized into sections so that different versions of the
image, such as, for example, a thumbnail version, a screen version, a print version and a lossless version, are progressive
by quality.

[0202] In one embodiment, the packets are arranged such that certain packets correspond to particular resolutions
such as a thumbnail. The combination of these packets with other packets represents the monitor resolution image,
which when combined with other packets may represent the printer version, etc. Using the POC and tile parts, portions
of a codestream may be grouped together. For example, all the tiles of the thumbnail size may be grouped together
followed by tiles for another resolution followed by tiles of another resolution, etc. Figure 21 illustrates an example
progression with tile parts for a single server. Each tile's thumbnail is grouped in tile-parts at the beginning of a file.
Figure 21A illustrates that tile-part 2101 is the only portion that is used for a thumbnail image. Figure 21B illustrates
that for a monitor resolution, tile-parts 2102-2104 have been included with tile-part 2101. Figure 21C illustrates that
for a printer resolution, tile-parts 2105 and 2106 have been included with tile-parts 2101-2104. Lastly, Figure 21D
illustrates that for a lossless version of the data, the remaining three tile-parts 2107-2108 are included with the rest of
the tile-parts. These sets of tile parts may be placed on a server in this progressive order.

[0203] One embodiment of the process for accessing the groupings of tile parts is shown in Figure 16. The process
may be performed by processing logic that may comprise hardware (e.g., dedicated logic, circuitry, etc.), software
(such as is run on a general purpose computer system or a dedicated machine), or a combination of both. The following
steps assume that the image has been transformed with sufficient resolution levels and layers to divide the image into
the four sizes.

[0204] Referring to Figure 16, processing logic initially determines the correct resolution and layering for the thumbnail
(processing block 1601). In one embodiment, to determine the correct resolution and layering for the thumbnail,
processing logic creates a POC constrained to that resolution and layer for each tile and then creates a set of tile-parts
and places this POC for each tile in the codestream.

[0205] Next, processing logic repeats processing block 1601 for the monitor resolution given that the thumbnail 
packets are already in the codestream (processing block 1602). Then, processing logic repeats processing block 1601 
for the printer resolution given that the monitor packets are already in the codestream (processing block 1603). 

[0206] Lastly, processing logic creates a POC marker with the extremes of the resolutions and layers for each tile 
(processing block 1604). In one embodiment, creating the POC with the extremes of the resolutions and layers is 
performed by creating a fourth set of tile-parts with the remaining tile-parts for a lossless version. 
[0207] Note that the particular orders of the packets defined in the POCs are not of importance, only the limits. 
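
The steps of Figure 16 can be sketched for a single tile as follows. This is a simplified illustration, not the method of any particular codec: packets are identified only by a hypothetical (resolution, layer) pair, components and precincts are ignored, the function name group_packets is invented for the example, and the limit values are made up. Each size takes every packet inside its limits that an earlier size has not already placed in the codestream, which reflects the remark that only the limits matter.

```python
from itertools import product


def group_packets(limits):
    """Split one tile's packets into ordered groups, one group per size.

    limits is an ordered list of (name, resolution_end, layer_end) tuples
    with exclusive upper bounds; the last entry holds the extremes.
    """
    placed = set()
    groups = []
    for name, resolution_end, layer_end in limits:
        # Packets within this size's limits that are not yet in the codestream.
        new = [
            packet
            for packet in product(range(resolution_end), range(layer_end))
            if packet not in placed
        ]
        placed.update(new)
        groups.append((name, new))
    return groups


# Hypothetical limits for the four sizes (blocks 1601 to 1604).
example_limits = [
    ("thumbnail", 1, 1),
    ("monitor", 2, 2),
    ("printer", 3, 2),
    ("lossless", 4, 3),
]
example_groups = group_packets(example_limits)
```

Under these assumptions each group would become one set of tile-parts for the tile, and the final group gathers whatever packets remain for the lossless version.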


An Exemplary Computer System 

[0208] Figure 20 is a block diagram of an exemplary computer system. Referring to Figure 20, computer system 
2000 may comprise an exemplary client 150 or sender 100 computer system. Computer system 2000 comprises a 
communication mechanism or bus 2011 for communicating information, and a processor 2012 coupled with bus 2011 
for processing information. Processor 2012 includes a microprocessor, but is not limited to a microprocessor, such as, 
for example, Pentium™, PowerPC™, Alpha™, etc. 

[0209] System 2000 further comprises a random access memory (RAM), or other dynamic storage device 2004 
(referred to as main memory) coupled to bus 2011 for storing information and instructions to be executed by processor 
2012. Main memory 2004 also may be used for storing temporary variables or other intermediate information during 
execution of instructions by processor 2012. 

[0210] Computer system 2000 also comprises a read only memory (ROM) and/or other static storage device 2006 
coupled to bus 2011 for storing static information and instructions for processor 2012, and a data storage device 2007, 
such as a magnetic disk or optical disk and its corresponding disk drive. Data storage device 2007 is coupled to bus 
2011 for storing information and instructions. 

[0211] Computer system 2000 may further be coupled to a display device 2021, such as a cathode ray tube (CRT) 
or liquid crystal display (LCD), coupled to bus 2011 for displaying information to a computer user. An alphanumeric 
input device 2022, including alphanumeric and other keys, may also be coupled to bus 2011 for communicating 
information and command selections to processor 2012. An additional user input device is cursor control 2023, such as a 
mouse, trackball, trackpad, stylus, or cursor direction keys, coupled to bus 2011 for communicating direction information 
and command selections to processor 2012, and for controlling cursor movement on display 2021. 
[0212] Another device that may be coupled to bus 2011 is hard copy device 2024, which may be used for printing 
instructions, data, or other information on a medium such as paper, film, or similar types of media. Furthermore, a 
sound recording and playback device, such as a speaker and/or microphone, may optionally be coupled to bus 2011 
for audio interfacing with computer system 2000. Another device that may be coupled to bus 2011 is a wired/wireless 
communication capability 2025 to communicate with a phone or handheld palm device. 

[0213] Note that any or all of the components of system 2000 and associated hardware may be used in the present 
invention. However, it can be appreciated that other configurations of the computer system may include some or all of 
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the devices. 

[0214] Whereas many alterations and modifications of the present invention will no doubt become apparent to a 
person of ordinary skill in the art after having read the foregoing description, it is to be understood that any particular 
embodiment shown and described by way of illustration is in no way intended to be considered limiting. Therefore, 
references to details of various embodiments are not intended to limit the scope of the claims which in themselves 
recite only those features regarded as essential to the invention. 



Claims 


1. A system comprising: 

a memory sized to include lines to store a band of an image and additional lines; 
a wavelet processing logic comprising 

a wavelet transform to generate coefficients when applied to data in the memory; 
access logic to read data from the memory into the line buffers to supply data stored in the memory to the 
wavelet transform and to store coefficients in the memory, such that after data stored at a first pair of lines 
is read from memory into the buffers of the access logic, the access logic reuses the first pair of lines to 
store coefficients generated by the wavelet transform that are associated with a second pair of lines 
different from the first pair of lines. 

2. The system defined in Claim 1 wherein the access logic stores coefficients in contiguous lines of memory with 
coefficients from the same subband and decomposition level adjacent each other. 


3. The system defined in Claim 1 wherein a first line of each of the first and second pairs of lines are located in the 
memory at an offset with respect to each other. 

4. The system defined in Claim 3 wherein the access logic stores the first outputs of the wavelet transform for each 
coefficient level in the additional lines within a distance of the offset. 

5. The system defined in Claim 3 wherein the size of the offset is different for each transform level. 

6. The system defined in Claim 3 wherein the size of the offset is equal to: 

$2^{(\text{transform level of coefficient being stored})}$ 

7. The system defined in Claim 6 wherein, during decomposition, the offset for storing the first rows of each pair of 
rows of L1 coefficients in the memory is two lines from the first row of data of the image associated with said each 
pair of rows of the L1 coefficients, and the offset for storing the first row of each pair of rows of L2 coefficients is 
four lines from the first row of L1 coefficients associated with said each pair of rows of the L2 coefficients. 

8. The system defined in Claim 1 wherein the access logic stores coefficients associated with a decomposition level 
greater than level three in the lines of the memory that previously stored the band of the image. 

9. The system defined in Claim 3 wherein the additional lines relating to the offset are above the line storing the band 
of the image. 

10. The system defined in Claim 1 wherein the wavelet transform is a forward wavelet transform. 

11. The system defined in Claim 1 wherein the wavelet transform is an inverse wavelet transform. 

12. A method comprising: 

reading data from a memory into line buffers to apply a wavelet transform thereto; and 
storing coefficients created by applying the wavelet transform at lines in the memory so that each set of 
coefficients generated from data stored at each pair of lines in the memory is stored in the memory at lines that 
are at an offset with respect to said each pair of lines in the memory. 
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13. The method defined in Claim 12 further comprising access logic reusing a first pair of lines to store coefficients 
generated by a wavelet transform, that are associated with a second pair of lines different from the first pair of 
lines, after data stored at a first pair of lines is read from memory into the buffers of the access logic, and wherein 
a first line of each of the first and second pairs of lines are located in the memory at an offset with respect to each 
other. 

14. The method defined in Claim 13 further comprising the access logic storing the first outputs of the wavelet transform 
for each coefficient level in additional lines within a distance of the offset. 

15. The method defined in Claim 13 wherein the size of the offset is different for each transform level. 

16. The method defined in Claim 13 wherein the size of the offset is equal to: 

$2^{(\text{transform level of coefficient being stored})}$ 

17. The method defined in Claim 16 wherein, during decomposition, the offset for storing the first rows of each pair of 
rows of L1 coefficients in the memory is two lines from the first row of data of the image associated with said each 
pair of rows of the L1 coefficients, and the offset for storing the first row of each pair of rows of L2 coefficients is 
four lines from the first row of L1 coefficients associated with said each pair of rows of the L2 coefficients. 

18. The method defined in Claim 12 further comprising access logic storing coefficients associated with a 
decomposition level greater than level three in the lines of the memory that previously stored the band of the image. 

19. The method defined in Claim 13 wherein the additional lines relating to the offset are above the line storing the band 
of the image. 


20. An article of manufacture comprising at least one recordable media storing executable instructions thereon which, 
when executed by a processing device, cause the processing device to: 

read data from a memory into line buffers to apply a wavelet transform thereto; and 
store coefficients created by applying the wavelet transform at lines in the memory so that each set of 
coefficients generated from data stored at each pair of lines in the memory is stored in the memory at lines that are 
at an offset with respect to said each pair of lines in the memory. 

21. The article of manufacture defined in Claim 20 further comprising instructions, which when executed by the 
processing device cause the processing device to reuse a first pair of lines to store coefficients generated by a wavelet 
transform, that are associated with a second pair of lines different from the first pair of lines, after data stored at a 
first pair of lines is read from memory into the buffers of the access logic, and wherein a first line of each of the 
first and second pairs of lines are located in the memory at an offset with respect to each other. 

22. An apparatus comprising: 

means for reading data from a memory into line buffers to apply a wavelet transform thereto; and 
means for storing coefficients created by applying the wavelet transform at lines in the memory so that each 
set of coefficients generated from data stored at each pair of lines in the memory is stored in the memory at 
lines that are at an offset with respect to said each pair of lines in the memory. 
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